Programmatic Access to the Standards Explorer
The Bridge2AI Standards Explorer data is hosted on Sage Synapse and is accessible through the Synapse REST API. All data tables are public, so no API key or authentication is required for read access.
Overview
The Standards Explorer consists of several public Synapse tables:
- DataStandardOrTool (
syn63096833): Main table containing all standards, tools, and resources - DataTopics (
syn63096835): Topics/domains that standards concern (e.g., EHR, Genomics, Image) - DataSubstrate (
syn63096834): Data formats and structures (e.g., JSON, CSV, BIDS) - Organization (
syn63096836): Organizations related to standards (e.g., HL7, W3C, CDISC) - DataStandardOrTool_denormalized (
syn65676531): Flattened view for easier querying
All tables belong to the Bridge2AI Standards Explorer project (syn63096806).
REST API Access
The Synapse REST API endpoint is: https://repo-prod.prod.sagebase.org
Basic Query Structure
To query a Synapse table, you need to:
- Start an async query job
- Poll for results using the returned token
- Retrieve the final results
Example: Query Standards Table (cURL)
# Step 1: Start the async query
curl -X POST https://repo-prod.prod.sagebase.org/repo/v1/entity/syn63096833/table/query/async/start \
-H "Content-Type: application/json" \
-d '{
"concreteType": "org.sagebionetworks.repo.model.table.QueryBundleRequest",
"entityId": "syn63096833",
"query": {
"sql": "SELECT * FROM syn63096833 LIMIT 10"
},
"partMask": 29
}'
# This returns a token like: {"token": "12345"}
# Step 2: Poll for results (repeat until status is not 202)
curl https://repo-prod.prod.sagebase.org/repo/v1/entity/syn63096833/table/query/async/get/12345
# Step 3: Parse the results from the JSON response
Example: Search by Name (cURL)
# Search for standards with "FHIR" in the name
curl -X POST https://repo-prod.prod.sagebase.org/repo/v1/entity/syn63096833/table/query/async/start \
-H "Content-Type: application/json" \
-d '{
"concreteType": "org.sagebionetworks.repo.model.table.QueryBundleRequest",
"entityId": "syn63096833",
"query": {
"sql": "SELECT id, name, description FROM syn63096833 WHERE name LIKE '\''%FHIR%'\'' LIMIT 5"
},
"partMask": 29
}'
Example: Query Denormalized Table
The denormalized table includes expanded columns for easier querying:
curl -X POST https://repo-prod.prod.sagebase.org/repo/v1/entity/syn65676531/table/query/async/start \
-H "Content-Type: application/json" \
-d '{
"concreteType": "org.sagebionetworks.repo.model.table.QueryBundleRequest",
"entityId": "syn65676531",
"query": {
"sql": "SELECT * FROM syn65676531 WHERE concerns_data_topic_names LIKE '\''%Genomics%'\''"
},
"partMask": 29
}'
Python Access
Simplest Example (requests library)
Here's the most minimal example to search the Standards Explorer:
import requests
import time
def search_standards(search_term):
"""Simple function to search standards by name."""
# Start query
response = requests.post(
"https://repo-prod.prod.sagebase.org/repo/v1/entity/syn63096833/table/query/async/start",
json={
"concreteType": "org.sagebionetworks.repo.model.table.QueryBundleRequest",
"entityId": "syn63096833",
"query": {"sql": f"SELECT id, name, description FROM syn63096833 WHERE name LIKE '%{search_term}%' LIMIT 10"},
"partMask": 29
},
headers={"Content-Type": "application/json"}
)
token = response.json()["token"]
# Wait for results
while True:
result = requests.get(
f"https://repo-prod.prod.sagebase.org/repo/v1/entity/syn63096833/table/query/async/get/{token}",
headers={"Content-Type": "application/json"}
)
if result.status_code != 202: # 202 = still processing
return result.json()["queryResult"]["queryResults"]["rows"]
time.sleep(1)
# Use it
for row in search_standards("FHIR"):
values = row["values"]
print(f"{values[0]}: {values[1]}")
# Output:
# B2AI_STANDARD:109: FHIR
# B2AI_STANDARD:845: CDC Introduction to FHIR
# B2AI_STANDARD:846: FHIR Drills
Note: The Synapse API requires an async query pattern (start query → poll for results), but the above function wraps this complexity for you.
Using httpx (Async)
This example shows how to query the API without the Synapse Python client:
import httpx
import asyncio
import json
SYNAPSE_BASE_URL = "https://repo-prod.prod.sagebase.org"
TABLE_ID = "syn63096833"
async def poll_async_job(client, table_id, async_token, max_wait=30):
"""Poll an async job until it completes or times out."""
url = f"{SYNAPSE_BASE_URL}/repo/v1/entity/{table_id}/table/query/async/get/{async_token}"
headers = {"Content-Type": "application/json"}
start_time = asyncio.get_event_loop().time()
while True:
elapsed = asyncio.get_event_loop().time() - start_time
if elapsed > max_wait:
raise TimeoutError(f"Query timed out after {max_wait} seconds")
response = await client.get(url, headers=headers)
# 202 means still processing
if response.status_code == 202:
await asyncio.sleep(1)
continue
response.raise_for_status()
return response.json()
async def query_standards(sql_query):
"""Query the Standards Explorer table."""
query_request = {
"concreteType": "org.sagebionetworks.repo.model.table.QueryBundleRequest",
"entityId": TABLE_ID,
"query": {"sql": sql_query},
"partMask": 0x1 | 0x4 | 0x10 # queryResults + selectColumns + columnModels
}
async with httpx.AsyncClient(timeout=60.0) as client:
# Start the async query
start_response = await client.post(
f"{SYNAPSE_BASE_URL}/repo/v1/entity/{TABLE_ID}/table/query/async/start",
json=query_request,
headers={"Content-Type": "application/json"}
)
start_response.raise_for_status()
# Get the async token
async_token = start_response.json().get("token")
# Poll for results
result_bundle = await poll_async_job(client, TABLE_ID, async_token)
# Extract rows
rows = result_bundle.get("queryResult", {}).get("queryResults", {}).get("rows", [])
columns = result_bundle.get("selectColumns", [])
return {
"columns": [col.get("name") for col in columns],
"rows": rows,
"row_count": len(rows)
}
# Example usage
async def main():
# Get all standards (limited to 10)
result = await query_standards("SELECT * FROM syn63096833 LIMIT 10")
print(f"Found {result['row_count']} standards")
# Search for FHIR-related standards
result = await query_standards(
"SELECT id, name, description FROM syn63096833 WHERE name LIKE '%FHIR%'"
)
for row in result['rows']:
values = row.get('values', [])
print(f"ID: {values[0]}, Name: {values[1]}")
# Run the async function
asyncio.run(main())
# Expected output:
# Found 10 standards
# ID: B2AI_STANDARD:109, Name: FHIR
# ID: B2AI_STANDARD:845, Name: CDC Introduction to FHIR
# ID: B2AI_STANDARD:846, Name: FHIR Drills
Using requests (Synchronous)
import requests
import time
SYNAPSE_BASE_URL = "https://repo-prod.prod.sagebase.org"
TABLE_ID = "syn63096833"
def query_standards(sql_query, max_wait=30):
"""Query the Standards Explorer table synchronously."""
query_request = {
"concreteType": "org.sagebionetworks.repo.model.table.QueryBundleRequest",
"entityId": TABLE_ID,
"query": {"sql": sql_query},
"partMask": 29 # Request all parts
}
# Start the async query
response = requests.post(
f"{SYNAPSE_BASE_URL}/repo/v1/entity/{TABLE_ID}/table/query/async/start",
json=query_request,
headers={"Content-Type": "application/json"}
)
response.raise_for_status()
async_token = response.json().get("token")
# Poll for results
start_time = time.time()
while True:
if time.time() - start_time > max_wait:
raise TimeoutError(f"Query timed out after {max_wait} seconds")
result = requests.get(
f"{SYNAPSE_BASE_URL}/repo/v1/entity/{TABLE_ID}/table/query/async/get/{async_token}",
headers={"Content-Type": "application/json"}
)
if result.status_code == 202: # Still processing
time.sleep(1)
continue
result.raise_for_status()
return result.json()
# Example usage
result = query_standards("SELECT id, name, description FROM syn63096833 WHERE name LIKE '%FHIR%' LIMIT 5")
rows = result.get("queryResult", {}).get("queryResults", {}).get("rows", [])
for row in rows:
values = row.get("values", [])
print(f"ID: {values[0]}, Name: {values[1]}")
# Output:
# ID: B2AI_STANDARD:109, Name: FHIR
# ID: B2AI_STANDARD:845, Name: CDC Introduction to FHIR
# ID: B2AI_STANDARD:846, Name: FHIR Drills
Using the Synapse Python Client
The official Synapse Python client provides a simpler interface:
import synapseclient
# Login (anonymous access for public data)
syn = synapseclient.Synapse()
syn.login()
# Query the table
query = "SELECT * FROM syn63096833 WHERE name LIKE '%FHIR%' LIMIT 10"
results = syn.tableQuery(query)
# Convert to pandas DataFrame
df = results.asDataFrame()
print(df.head())
# Access specific columns
print(df[['id', 'name', 'description']])
# Example output:
# id name \
# 0 B2AI_STANDARD:109 FHIR
# 1 B2AI_STANDARD:845 CDC Introduction to FHIR
# 2 B2AI_STANDARD:846 FHIR Drills
#
# description
# 0 Fast Healthcare Interoperability Resources
# 1 CDC Introduction to FHIR - Training...
# 2 FHIR Drills
Installation:
pip install synapseclient
Using the Synapse CLI
Query tables directly from the command line:
# Install synapseclient
pip install synapseclient
# Query and save to CSV
synapse query "SELECT * FROM syn63096833 LIMIT 10" > tmp.csv
# Clean the result (remove header rows)
tail -n +3 tmp.csv > standards.csv
rm tmp.csv
Common Query Examples
List Available Topics
First, see what topics are available:
import requests
import time
def query_topics():
result = query_standards("SELECT id, name FROM syn63096835 LIMIT 10")
rows = result.get("queryResult", {}).get("queryResults", {}).get("rows", [])
for row in rows:
values = row.get("values", [])
print(f"{values[0]}: {values[1]}")
# Output:
# B2AI_TOPIC:1: Biology
# B2AI_TOPIC:2: Cell
# B2AI_TOPIC:3: Cheminformatics
# B2AI_TOPIC:4: Clinical Observations
# B2AI_TOPIC:5: Data
# B2AI_TOPIC:6: Demographics
# B2AI_TOPIC:7: Disease
# B2AI_TOPIC:8: Drug
# B2AI_TOPIC:9: EHR
# B2AI_TOPIC:10: EKG
Search by Topic
-- Find all standards related to genomics (topic ID: B2AI_TOPIC:5)
SELECT id, name, description
FROM syn63096833
WHERE concerns_data_topic LIKE '%B2AI_TOPIC:5%'
LIMIT 5
# Example output:
# B2AI_STANDARD:114: Genomics Operations
# B2AI_STANDARD:127: FuGE-ML
# B2AI_STANDARD:128: FuGEFlow
# B2AI_STANDARD:141: GCDML
Search by Organization
-- Find standards from HL7 (org ID: B2AI_ORG:48)
SELECT id, name, description, url
FROM syn63096833
WHERE responsible_organization LIKE '%B2AI_ORG:48%'
OR has_relevant_organization LIKE '%B2AI_ORG:48%'
Search by Data Substrate
-- Find standards that work with JSON (substrate ID: B2AI_SUBSTRATE:58)
SELECT id, name, description
FROM syn63096833
WHERE has_relevant_data_substrate LIKE '%B2AI_SUBSTRATE:58%'
Filter by Category
-- Find all biomedical standards
SELECT id, name, description
FROM syn63096833
WHERE category = 'B2AI_STANDARD:BiomedicalStandard'
Full-text Search
-- Search across multiple text fields
SELECT id, name
FROM syn63096833
WHERE description LIKE '%genomic%'
LIMIT 5
# Example output:
# B2AI_STANDARD:44: SDTM
# B2AI_STANDARD:114: Genomics Operations
# B2AI_STANDARD:127: FuGE-ML
# B2AI_STANDARD:128: FuGEFlow
# B2AI_STANDARD:141: GCDML
Filter by Properties
-- Find open standards
SELECT id, name, is_open
FROM syn63096833
WHERE is_open = 'true'
LIMIT 5
# Example output:
# B2AI_STANDARD:1: .ACE format (Open: true)
# B2AI_STANDARD:2: DMS (Open: true)
# B2AI_STANDARD:3: ABCD (Open: true)
# B2AI_STANDARD:4: AGP (Open: true)
# B2AI_STANDARD:5: AnIML (Open: true)
Query the Denormalized Table
For easier querying with human-readable values, use the denormalized table:
-- Search by topic name instead of ID
SELECT id, name, concerns_data_topic_names
FROM syn65676531
WHERE concerns_data_topic_names LIKE '%Genomics%'
import synapseclient
syn = synapseclient.Synapse()
syn.login()
# Query denormalized table with readable column names
query = """
SELECT id, name, category_label, concerns_data_topic_names,
responsible_organization_names
FROM syn65676531
WHERE concerns_data_topic_names LIKE '%EHR%'
"""
df = syn.tableQuery(query).asDataFrame()
print(df)
Rate Limits and Best Practices
- No authentication required for public tables (read-only access)
- The API uses async queries - always poll for results rather than expecting immediate responses
- Set reasonable timeouts (30-60 seconds for most queries)
- Use LIMIT clauses to avoid retrieving too much data at once
- The partMask parameter controls which parts of the query result are returned:
0x1(1): Query results0x4(4): Select columns0x10(16): Column models0x1D(29): All parts (commonly used)
Additional Resources
- Synapse REST API Documentation
- Synapse Python Client Documentation
- Standards Explorer MCP Implementation
- Standards Explorer Project on Synapse
Support
For questions or issues: - Open an issue on the b2ai-standards-registry GitHub repository - Contact the Bridge2AI Standards team