Properties

The get_properties() function allows the retrieval of specific properties without having to deal with entire compound records. This is especially useful for retrieving the properties of a large number of compounds at once:

p = pcp.get_properties("SMILES", "CC", "smiles", searchtype="superstructure")

Multiple properties may be specified in a list, or in a comma-separated string. The available properties are: MolecularFormula, MolecularWeight, ConnectivitySMILES, SMILES, InChI, InChIKey, IUPACName, XLogP, ExactMass, MonoisotopicMass, TPSA, Complexity, Charge, HBondDonorCount, HBondAcceptorCount, RotatableBondCount, HeavyAtomCount, IsotopeAtomCount, AtomStereoCount, DefinedAtomStereoCount, UndefinedAtomStereoCount, BondStereoCount, DefinedBondStereoCount, UndefinedBondStereoCount, CovalentUnitCount, Volume3D, XStericQuadrupole3D, YStericQuadrupole3D, ZStericQuadrupole3D, FeatureCount3D, FeatureAcceptorCount3D, FeatureDonorCount3D, FeatureAnionCount3D, FeatureCationCount3D, FeatureRingCount3D, FeatureHydrophobeCount3D, ConformerModelRMSD3D, EffectiveRotorCount3D, ConformerCount3D.
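As a rough illustration of the two accepted forms, the sketch below turns either a list or a comma-separated string into a single comma-separated string, and rejects names that are not in the list above. The helper normalize_properties() and the constant AVAILABLE_PROPERTIES are hypothetical names introduced here, not part of PubChemPy, and the check only recognises the exact spellings listed above.

```python
# Property names copied from the list above.
AVAILABLE_PROPERTIES = frozenset("""
MolecularFormula MolecularWeight ConnectivitySMILES SMILES InChI InChIKey
IUPACName XLogP ExactMass MonoisotopicMass TPSA Complexity Charge
HBondDonorCount HBondAcceptorCount RotatableBondCount HeavyAtomCount
IsotopeAtomCount AtomStereoCount DefinedAtomStereoCount
UndefinedAtomStereoCount BondStereoCount DefinedBondStereoCount
UndefinedBondStereoCount CovalentUnitCount Volume3D XStericQuadrupole3D
YStericQuadrupole3D ZStericQuadrupole3D FeatureCount3D
FeatureAcceptorCount3D FeatureDonorCount3D FeatureAnionCount3D
FeatureCationCount3D FeatureRingCount3D FeatureHydrophobeCount3D
ConformerModelRMSD3D EffectiveRotorCount3D ConformerCount3D
""".split())


def normalize_properties(properties):
    """Return a comma-separated property string from a list or a string.

    Hypothetical helper: raises ValueError for names not listed above.
    """
    if isinstance(properties, str):
        names = [name.strip() for name in properties.split(",")]
    else:
        names = [str(name).strip() for name in properties]
    names = [name for name in names if name]  # drop empty entries
    unknown = [name for name in names if name not in AVAILABLE_PROPERTIES]
    if unknown:
        raise ValueError(f"Unknown properties: {', '.join(unknown)}")
    return ",".join(names)


# Both forms produce the same request string.
print(normalize_properties(["MolecularFormula", "XLogP"]))
print(normalize_properties("MolecularFormula, XLogP"))
```

The resulting string could then be passed as the first argument to get_properties(), so a misspelled property name is caught locally before any request is made.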

Synonyms

Get a list of synonyms for a given input using the get_synonyms() function:

pcp.get_synonyms("Aspirin", "name")
pcp.get_synonyms("Aspirin", "name", "substance")

Inputs that match more than one SID/CID return a separate list of synonyms for each matching record.
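When several records match, it can be convenient to index the separate lists by identifier. The sketch below assumes each returned entry is a dict with a "CID" key and a "Synonym" list, as used in the CAS Registry Number example in this guide. The "SID" key for substance lookups is an assumption, and synonyms_by_id() is a hypothetical helper, not part of PubChemPy. The sample data is a made-up placeholder, not real PubChem output.

```python
def synonyms_by_id(results, id_key="CID"):
    """Map each record identifier to its list of synonyms.

    Hypothetical helper; pass id_key="SID" for substance results
    (assumed key name).
    """
    return {entry[id_key]: entry.get("Synonym", []) for entry in results}


# Placeholder stand-in for a get_synonyms() result with two matches.
sample = [
    {"CID": 111, "Synonym": ["name A", "name B"]},
    {"CID": 222},  # a record with no synonyms listed
]

for cid, names in synonyms_by_id(sample).items():
    print(cid, names)
```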

CAS Registry Numbers

CAS Registry Numbers are not officially supported by PubChem, but they are often present in the synonyms associated with a compound. Therefore it is straightforward to retrieve them by filtering the synonyms to just those with the CAS Registry Number format:

import re

import pubchempy as pcp

for result in pcp.get_synonyms("Aspirin", "name"):
    cid = result["CID"]
    cas_rns = []
    for syn in result.get("Synonym", []):
        # Keep only synonyms that consist entirely of a CAS-style number.
        match = re.fullmatch(r"(\d{2,7}-\d\d-\d)", syn)
        if match:
            cas_rns.append(match.group(1))
    print(f"CAS registry numbers for CID {cid}: {cas_rns}")
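A format match alone can let through strings that merely look like CAS Registry Numbers. As an optional refinement, the sketch below also verifies the check digit: the final digit should equal the weighted sum of the preceding digits (weights 1, 2, 3, ... counted from the right) modulo 10. The function is_valid_cas_rn() is a hypothetical helper, not part of PubChemPy.

```python
import re

CAS_RN_PATTERN = re.compile(r"(\d{2,7})-(\d\d)-(\d)")


def is_valid_cas_rn(candidate):
    """Return True if candidate has CAS RN format and a correct check digit."""
    match = CAS_RN_PATTERN.fullmatch(candidate)
    if match is None:
        return False
    body = match.group(1) + match.group(2)  # all digits except the check digit
    check_digit = int(match.group(3))
    # Weight the digits 1, 2, 3, ... starting from the rightmost body digit.
    total = sum(
        weight * int(digit)
        for weight, digit in enumerate(reversed(body), start=1)
    )
    return total % 10 == check_digit


print(is_valid_cas_rn("50-78-2"))  # True: 8*1 + 7*2 + 0*3 + 5*4 = 42
print(is_valid_cas_rn("50-78-3"))  # False: wrong check digit
print(is_valid_cas_rn("aspirin"))  # False: wrong format
```

A filter like this could replace the plain regular expression test in the loop above when stricter matching is wanted.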

Identifiers

There are three functions for getting a list of identifiers for a given input: get_cids(), get_sids() and get_aids().

For example, passing a CID to get_sids() will return a list of SIDs corresponding to the Substance records that were standardised and merged to produce the given Compound.