pandasgwas.summary_statistics

Functions for retrieving summary statistics data from FTP storage.

from pandasgwas import summary_statistics
# Search the index by PubMed_id, study_accession_id, and/or EFO_trait_id. The matching index entries are returned as a DataFrame.
search_DF = summary_statistics.search(PubMed_id='27918534', study_accession_id='GCST003966')
# Using the search results, open the data directory in the browser.
summary_statistics.browser(search_DF)
# Using the search results, download the summary statistics data to $HOME/pandasgwas_home.
summary_statistics.download(search_DF)
# Using the search results, load the data from $HOME/pandasgwas_home into a DataFrame.
df = summary_statistics.parse(search_DF)
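For scripted use, the search, download, and parse calls above can be chained in a single helper. The following is a minimal sketch, not part of pandasgwas: `fetch_summary_statistics` is a hypothetical name, the module is passed in as the `api` argument so the helper can be tried with a stand-in object, and it assumes that `search` returns an empty DataFrame when nothing matches and that `interactive=False` is the appropriate setting for unattended runs.

```python
def fetch_summary_statistics(api, **criteria):
    """Search, download, and parse in one step (illustrative helper).

    `api` is expected to be the pandasgwas.summary_statistics module, or any
    object exposing search/download/parse with the documented signatures.
    `criteria` are forwarded to search(), e.g. study_accession_id='GCST003966'.
    """
    search_df = api.search(**criteria)

    # Assumption: an unsuccessful search yields an empty result.
    if len(search_df) == 0:
        raise LookupError(f"no summary statistics found for {criteria}")

    # Download into $HOME/pandasgwas_home, then load from there.
    api.download(search_df)

    # interactive=False is documented as disabling interactive prompts.
    return api.parse(search_df, interactive=False)


# Intended usage (requires pandasgwas and network access):
# from pandasgwas import summary_statistics
# df = fetch_summary_statistics(summary_statistics,
#                               study_accession_id='GCST003966')
```

Passing the module as an argument keeps the helper free of a hard import, which makes it straightforward to exercise without network access.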

browser

browser(
    search_DF: DataFrame, interactive: bool = True
) -> None

Open the location where the data is stored in a web browser.

Parameters:
  • search_DF (DataFrame) –

    A DataFrame holding the FTP storage locations, as returned by the search function.

  • interactive (bool, default: True ) –

    Whether to show interactive prompts.

Returns:
  • None

download

download(search_DF: DataFrame) -> None

Download FTP data to directory $HOME/pandasgwas_home.

Parameters:
  • search_DF (DataFrame) –

    A DataFrame holding the FTP storage locations, as returned by the search function.

Returns:
  • None
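Since download returns nothing, one way to confirm what was fetched is to inspect the download directory directly. The sketch below is an illustration only: `pandasgwas_home` and `list_downloaded` are hypothetical helper names, it assumes `Path.home()` resolves to the same location the library treats as $HOME, and it makes no assumption about how files are laid out inside the directory (it simply walks it recursively).

```python
from pathlib import Path
from typing import List, Optional


def pandasgwas_home() -> Path:
    """Return the documented download directory, $HOME/pandasgwas_home."""
    return Path.home() / "pandasgwas_home"


def list_downloaded(root: Optional[Path] = None) -> List[Path]:
    """List every file under the download directory, sorted by path.

    Returns an empty list if the directory does not exist yet,
    for example before the first call to download().
    """
    root = pandasgwas_home() if root is None else Path(root)
    if not root.is_dir():
        return []
    return sorted(p for p in root.rglob("*") if p.is_file())


if __name__ == "__main__":
    for path in list_downloaded():
        print(path)
```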

parse

parse(
    search_DF: DataFrame, interactive: bool = True
) -> DataFrame

Loads the specified data from the directory $HOME/pandasgwas_home into a DataFrame.

Parameters:
  • search_DF (DataFrame) –

    A DataFrame holding the FTP storage locations, as returned by the search function.

  • interactive (bool, default: True ) –

    Whether to show interactive prompts.

Returns:
  • DataFrame

    A DataFrame containing the summary statistics.

search

search(
    PubMed_id: str = None,
    study_accession_id: str = None,
    EFO_trait_id: str = None,
    online_index: bool = False,
) -> DataFrame

Search the index for the storage locations of data matching the query criteria.

Parameters:
  • PubMed_id (str, default: None ) –

    PubMed ID of the publication.

  • study_accession_id (str, default: None ) –

    Study accession ID.

  • EFO_trait_id (str, default: None ) –

    EFO trait ID.

  • online_index (bool, default: False ) –

    Whether to use an online index.

Returns:
  • DataFrame

    A DataFrame that records where the data is stored.
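Because all three ID parameters are plain strings, a malformed value may only surface as an empty result. The sketch below shows one way to check criteria before calling search. It is not part of pandasgwas: `build_search_criteria` is a hypothetical helper, the `GCST` plus digits pattern is inferred from the example accession `GCST003966`, the requirement of at least one criterion is an assumption, and `EFO_trait_id` is passed through unchecked because its accepted formats are not documented here.

```python
import re

# Assumed formats: PubMed IDs are numeric strings; study accessions look
# like the documented example 'GCST003966' (prefix GCST followed by digits).
_PUBMED_RE = re.compile(r"[0-9]+")
_GCST_RE = re.compile(r"GCST[0-9]+")


def build_search_criteria(PubMed_id=None, study_accession_id=None,
                          EFO_trait_id=None):
    """Validate the IDs and return only the criteria that were supplied."""
    if PubMed_id is not None and not _PUBMED_RE.fullmatch(PubMed_id):
        raise ValueError(f"PubMed_id should be numeric, got {PubMed_id!r}")
    if (study_accession_id is not None
            and not _GCST_RE.fullmatch(study_accession_id)):
        raise ValueError(
            f"study_accession_id should look like 'GCST003966', "
            f"got {study_accession_id!r}"
        )

    supplied = {
        "PubMed_id": PubMed_id,
        "study_accession_id": study_accession_id,
        "EFO_trait_id": EFO_trait_id,  # not validated (format not assumed)
    }
    criteria = {k: v for k, v in supplied.items() if v is not None}

    # Assumption: a search with no criteria is not meaningful.
    if not criteria:
        raise ValueError("provide at least one search criterion")
    return criteria


# Intended usage (requires pandasgwas):
# search_DF = summary_statistics.search(
#     **build_search_criteria(PubMed_id='27918534',
#                             study_accession_id='GCST003966'))
```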