sccloud.read_input

sccloud.read_input(input_file, genome=None, return_type='AnnData', concat_matrices=False, h5ad_mode='a', ngene=None, select_singlets=False, channel_attr=None, black_list=[])

Load data into memory.

Inputs can be in 10x Genomics v2 and v3 formats (HDF5 or mtx), HCA DCP mtx and csv formats, Drop-seq DGE format, or CSV format.

Parameters
  • input_file (str) – Input file name.

  • genome (str, optional (default: None)) – A string containing comma-separated genome names. sccloud will read all matrices matching the genome names. If genome is None, all matrices will be considered.

  • return_type (str, optional (default: ‘AnnData’)) – Return object type; either ‘MemData’ or ‘AnnData’.

  • concat_matrices (boolean, optional (default: False)) – If the input file contains multiple matrices, whether to concatenate them into one AnnData object (True) or return a list of AnnData objects (False).

  • h5ad_mode (str, optional (default: ‘a’)) – If the input is in h5ad format, the backed mode for loading the data. Can be ‘a’, ‘r’, or ‘r+’. ‘a’ loads all data into memory.

  • ngene (int, optional (default: None)) – Minimum number of genes a barcode must have to be kept. By default, all barcodes are kept.

  • select_singlets (bool, optional (default: False)) – Whether to keep only DemuxEM-predicted singlets when loading data.

  • channel_attr (str, optional (default: None)) – Use channel_attr to represent different samples. This will set a ‘Channel’ column field with channel_attr.

  • black_list (List[str], optional (default: [])) – Attributes in the black list will be popped out.
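
The selection semantics of genome and ngene described above can be illustrated with a minimal sketch. This is not sccloud's implementation: the helpers select_genomes and filter_barcodes, the toy data layout, and the reading of "number of genes" as "genes with a nonzero count" are all assumptions made for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical stand-in for a multi-genome input:
# genome name -> {barcode: per-gene counts}.
Matrices = Dict[str, Dict[str, List[int]]]


def select_genomes(matrices: Matrices, genome: Optional[str] = None) -> Matrices:
    """Keep matrices named in the comma-separated string; None keeps all."""
    if genome is None:
        return dict(matrices)
    wanted = {name.strip() for name in genome.split(",")}
    return {name: mat for name, mat in matrices.items() if name in wanted}


def filter_barcodes(
    matrix: Dict[str, List[int]], ngene: Optional[int] = None
) -> Dict[str, List[int]]:
    """Keep barcodes with at least `ngene` nonzero genes; None keeps all.

    Assumption: a gene counts toward `ngene` when its count is above zero.
    """
    if ngene is None:
        return dict(matrix)
    return {
        barcode: counts
        for barcode, counts in matrix.items()
        if sum(1 for c in counts if c > 0) >= ngene
    }


# Toy data, invented for this example only.
toy = {
    "mm10": {"AAAC": [1, 0, 3], "TTTG": [0, 0, 2]},
    "GRCh38": {"CCCA": [5, 1, 1]},
}
selected = select_genomes(toy, genome="mm10")
kept = filter_barcodes(selected["mm10"], ngene=2)
```

With these toy inputs, only the mm10 matrix is selected, and only the barcode with two detected genes survives the ngene=2 threshold.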

Returns

A MemData object, an AnnData object, or a list of AnnData objects containing the count matrices.

Return type

MemData object, AnnData object, or list of AnnData objects
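
Because the result may be a single object or a list of objects, calling code may want to normalize it before iterating. The helper below is a small sketch of one way to do that; as_list is a hypothetical name and is not part of sccloud.

```python
from typing import Any, List


def as_list(result: Any) -> List[Any]:
    """Wrap a single returned object in a list; pass a list through unchanged."""
    return result if isinstance(result, list) else [result]
```

A caller could then write `for data in as_list(result): ...` regardless of whether the input file held one matrix or several.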

Examples

>>> import sccloud
>>> adata = sccloud.read_input('example_10x.h5', genome='mm10')
>>> adata = sccloud.read_input('example.h5ad', h5ad_mode='r+')
>>> adata = sccloud.read_input('example_ADT.csv')