Bring your own data

Once you have run the synthetic_binary quickstart, the next step is your real classification task. This page walks through the full upload → run → predict flow on a CSV or parquet file you already have.

What works in MVP

Feature	Status
Numeric features (int, float, bool)	✅
Binary classification	✅
Multi-class classification (≤ 20 classes)	✅
Regression (numeric target)	✅
Categorical / string features	❌ encode client-side first
Free-text features	❌
> 50 MB uploads	❌ raise to support@gnosyslabs.com

If your data has categoricals, one-hot or label-encode them in pandas before upload. Server-side categorical handling is on the roadmap — open a ticket if you'd like it bumped.
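A minimal sketch of that client-side encoding, assuming a toy DataFrame whose column names (age, plan, region, label) and values are hypothetical. It one-hot encodes one string column with pandas.get_dummies and label-encodes another through the pandas category dtype, then confirms that every feature column ended up numeric.

```python
import pandas as pd

# Hypothetical toy frame: one numeric feature, two string features, a target.
df = pd.DataFrame({
    "age": [34, 51, 27, 45],
    "plan": ["free", "team", "free", "starter"],
    "region": ["eu", "us", "apac", "eu"],
    "label": [0, 1, 0, 1],
})

# One-hot encode a low-cardinality column; dtype=int yields 0/1 columns.
encoded = pd.get_dummies(df, columns=["plan"], dtype=int)

# Label-encode another column: categories are sorted, then mapped to 0..k-1.
encoded["region"] = encoded["region"].astype("category").cat.codes

# Confirm that every feature column is now numeric.
feature_cols = [c for c in encoded.columns if c != "label"]
all_numeric = all(
    pd.api.types.is_numeric_dtype(encoded[c]) for c in feature_cols
)
print(encoded.columns.tolist())
print("all features numeric:", all_numeric)
```

Label encoding imposes an arbitrary order on the categories, and a linear model will treat that order as meaningful, so one-hot encoding is usually the safer default for low-cardinality columns.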

Step 1 — install + authenticate

pip install "gnosyslabs[data]"          # adds pandas + pyarrow
export GNOSYS_API_KEY=gn_live_…

The [data] extra is what makes client.datasets.upload(df, …) work with a DataFrame in memory; without it you have to upload files from disk.

Step 2 — upload your training data

import pandas as pd
from gnosys import GnosysClient

client = GnosysClient()  # reads $GNOSYS_API_KEY

df = pd.read_csv("my_classification_data.csv")
# Every feature column must be numeric; the target column here is named "label".

dataset = client.datasets.upload(
    df,
    name="my_first_dataset",
    task="classification",
    target_column="label",
)
print(dataset.dataset_id)   # tds_abc1234… — keep this for the next call
print(dataset.n_samples, dataset.n_features, dataset.n_classes)

Validation that runs server-side on the upload:

every feature column must be numeric (int / float / bool);
labels: at least 2 unique, at most 20 unique for classification;
no NaNs in feature columns;
upload size ≤ 50 MB.

If any check fails, the upload returns a 422 with an error code and a message that points at the offending column.
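Most of the checks listed above can be approximated locally before uploading. The sketch below is a hypothetical helper (preflight is not part of the Gnosys client) that mirrors the dtype, NaN, and label-count rules as described on this page; the server's actual logic may differ in detail. It does not replicate the 50 MB check, since that depends on the serialised upload size.

```python
import pandas as pd


def preflight(df: pd.DataFrame, target_column: str, max_classes: int = 20) -> list:
    """Return a list of problems; an empty list means the local checks passed."""
    if target_column not in df.columns:
        return [f"target column {target_column!r} not found"]

    problems = []
    features = df.drop(columns=[target_column])
    for col in features.columns:
        # bool is accepted alongside int and float, matching the rule above.
        is_ok = (pd.api.types.is_numeric_dtype(features[col])
                 or pd.api.types.is_bool_dtype(features[col]))
        if not is_ok:
            problems.append(f"{col}: non-numeric dtype")
        elif features[col].isna().any():
            problems.append(f"{col}: contains NaN")

    n_classes = df[target_column].nunique()
    if not 2 <= n_classes <= max_classes:
        problems.append(
            f"{target_column}: {n_classes} unique labels, expected 2 to {max_classes}"
        )
    return problems


# Hypothetical frame with one NaN and one string feature column.
bad = pd.DataFrame({
    "x1": [0.1, 0.4, None],
    "city": ["a", "b", "c"],
    "label": [0, 1, 0],
})
issues = preflight(bad, "label")
for issue in issues:
    print(issue)
```

Running the helper before client.datasets.upload saves a round trip, but the server's 422 response remains the authoritative result.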

You can list, get, and delete uploaded datasets:

for d in client.datasets.list():
    print(d.dataset_id, d.name, d.n_samples, d.created_at)

client.datasets.delete("tds_abc1234…")

Deletes are soft: the on-disk parquet file is retained for audit.

Step 3 — run a Gnosys job on your data

Same shape as the synthetic-data quickstart, just with your dataset_id:

run = client.runs.create(
    domain="tabular",
    strategist={"kind": "hp_sweep", "key": "C",
                "values": [0.001, 0.01, 0.1, 1.0, 10.0, 100.0]},
    spec_template={
        "spec_id": "_t",
        "name": "first run",
        "hypothesis": "regularisation strength sweep",
        "task": "classification",
        "dataset_id": dataset.dataset_id,
        "model_family": "logistic",
        "hyperparameters": {"C": 1.0},
    },
    max_iterations=10,
)
run = client.runs.wait(run.run_id, timeout=600)
print(run.status, run.best_tier, run.best_primary_score)

Behind the scenes Gnosys's executor splits your uploaded data into train / validation / honest-eval cohorts, fits each hyperparameter point, and runs the full validation layer (multi-calibration, dist-shift, the four honest-eval verifiers, the LLM critic if you've enabled it). The best_tier / best_primary_score on the returned Run are computed on the server's internal test split.

Step 4 — predict on rows the server has never seen

Use client.runs.predict to evaluate against your own held-out data, to serve a deployed model, or just to score fresh rows in production:

new_rows = pd.read_csv("rows_to_score.csv")
preds = client.runs.predict(run.run_id, new_rows)

print(preds.predictions[:5])    # decoded class labels
print(preds.classes)             # column order of probas
print(preds.probas[0])           # [P(class_0), P(class_1), …]

The server reloads the run's promoted model (run.has_promoted_model is True whenever one is available), applies it to your rows, and returns the predictions together with the predict_proba output. Your test rows never touch Gnosys's training side.
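Because preds.classes gives the column order of preds.probas, the two can be combined into a labelled table. The sketch below uses hard-coded, hypothetical stand-ins for both attributes so that it runs without a server; with a real response you would pass preds.probas and preds.classes instead.

```python
import pandas as pd

# Hypothetical stand-ins for preds.classes and preds.probas.
classes = ["cat", "dog", "fish"]
probas = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
]

# One column per class, in the order the server reported.
proba_df = pd.DataFrame(probas, columns=classes)

# The most probable class for each row.
top_class = proba_df.idxmax(axis=1).tolist()
print(proba_df)
print(top_class)
```

If the server picks the most probable class for each row (an assumption, not something this page states), top_class should agree with preds.predictions.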

You can also pass:

a numpy ndarray — pass feature_names=… so the server can validate column order;
a path to a CSV / parquet file.

import numpy as np
preds = client.runs.predict(
    run.run_id,
    np.random.randn(8, dataset.n_features),
    feature_names=dataset.feature_names,
)
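When the ndarray comes from a DataFrame whose columns are in a different order, selecting by name first keeps the array aligned with feature_names. A minimal sketch, assuming a hypothetical list standing in for dataset.feature_names and hypothetical column names:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for dataset.feature_names (the order used at upload).
feature_names = ["age", "income", "tenure"]

# Fresh rows whose columns arrive in a different order, plus an extra column.
new_rows = pd.DataFrame({
    "income": [52_000.0, 61_500.0],
    "customer_id": [101, 102],
    "age": [34.0, 45.0],
    "tenure": [3.0, 7.0],
})

# Fail early if a feature the model expects is absent.
missing = [c for c in feature_names if c not in new_rows.columns]
if missing:
    raise ValueError(f"missing feature columns: {missing}")

# Selecting by name reorders the columns and drops the extra one.
X = new_rows[feature_names].to_numpy(dtype=np.float64)
print(X.shape)
```

X can then be passed to client.runs.predict together with feature_names=feature_names.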

End-to-end script

Full notebook-ready snippet:

import pandas as pd
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from gnosys import GnosysClient

client = GnosysClient()
df = pd.read_csv("my_data.csv")

# Hold out 25% locally — these rows never leave your machine until
# the final scoring call.
train_df, test_df = train_test_split(
    df, test_size=0.25, stratify=df["label"], random_state=42,
)

dataset = client.datasets.upload(
    train_df,
    name="experiment_v1",
    task="classification",
    target_column="label",
)
run = client.runs.create(
    domain="tabular",
    strategist={"kind": "hp_sweep", "key": "C",
                "values": [0.01, 0.1, 1.0, 10.0]},
    spec_template={
        "spec_id": "_t", "name": "v1",
        "hypothesis": "first sweep on real data",
        "task": "classification",
        "dataset_id": dataset.dataset_id,
        "model_family": "logistic",
        "hyperparameters": {"C": 1.0},
    },
    max_iterations=8,
)
run = client.runs.wait(run.run_id, timeout=600)

X_test = test_df.drop(columns=["label"])
y_test = test_df["label"].to_numpy()
preds = client.runs.predict(run.run_id, X_test)
# Look up the positive-class column once; this assumes the positive label is 1.
pos_idx = preds.classes.index(1)
p_pos = [row[pos_idx] for row in preds.probas]
print("local test AUC:", roc_auc_score(y_test, p_pos))

What if no model gets promoted?

If the validation layer blocks every spec (for example, because of severe calibration failures or because the deception test fires), run.best_tier will be rejected_validation or rejected_below_floor, run.has_promoted_model will be False, and client.runs.predict will return 409. Read /v1/findings?run_id=… to see what was rejected and why; common fixes are regularisation, more data, or removing a leaking feature.
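One way to avoid the 409 is to check run.has_promoted_model before calling predict. The sketch below wraps that check in a hypothetical helper (predict_if_promoted is not part of the client) and exercises it with stub objects in place of a real client and Run, so the stub return value and the placeholder tier name are illustrative only.

```python
from types import SimpleNamespace


def predict_if_promoted(client, run, rows):
    """Return predictions, or None when the run has no promoted model."""
    if not run.has_promoted_model:
        # Skip the call rather than trigger the 409 described above.
        print(f"run {run.run_id}: no promoted model (best_tier={run.best_tier})")
        return None
    return client.runs.predict(run.run_id, rows)


# Stub objects standing in for a real client and Run, for illustration only.
stub_client = SimpleNamespace(
    runs=SimpleNamespace(
        predict=lambda run_id, rows: {"run_id": run_id, "n_rows": len(rows)}
    )
)
rejected = SimpleNamespace(
    run_id="run_a", has_promoted_model=False, best_tier="rejected_validation"
)
promoted = SimpleNamespace(
    run_id="run_b", has_promoted_model=True, best_tier="placeholder_tier"
)

result_rejected = predict_if_promoted(stub_client, rejected, [[0.1, 0.2]])
result_promoted = predict_if_promoted(stub_client, promoted, [[0.1, 0.2], [0.3, 0.4]])
print(result_rejected, result_promoted)
```

With a real client, pass the Run returned by client.runs.wait and your DataFrame of rows; a None result is the cue to read the findings endpoint.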

Plan caps

Plan	datasets cap	predict calls / mo
free	2	500
starter	20	10,000
team	200	100,000
enterprise	unlimited	unlimited

Hitting either cap returns 402. Deleting an unused dataset frees room under the datasets cap; otherwise, upgrade your plan.
