Bring your own data
Once you are past synthetic_binary, the next step is your own
classification task. This page walks through the full
upload → run → predict flow on a CSV or parquet file you already have.
What works in the MVP
| Feature | Status |
|---|---|
| Numeric features (int, float, bool) | ✅ |
| Binary classification | ✅ |
| Multi-class classification (≤ 20 classes) | ✅ |
| Regression (numeric target) | ✅ |
| Categorical / string features | ❌ encode client-side first |
| Free-text features | ❌ |
| > 50 MB uploads | ❌ contact support@gnosyslabs.com |
If your data has categoricals, one-hot or label-encode them in pandas before upload. Server-side categorical handling is on the roadmap — open a ticket if you'd like it bumped.
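As a minimal sketch of that client-side encoding step, the snippet below one-hot encodes and label-encodes a toy frame with pandas. The column names (`age`, `plan`, `label`) are made up for illustration and are not part of the Gnosys API.

```python
import pandas as pd

# Toy frame with one categorical column; column names are illustrative only.
df = pd.DataFrame({
    "age": [34, 51, 27, 45],
    "plan": ["free", "team", "free", "starter"],
    "label": [0, 1, 0, 1],
})

# One-hot encode: each category becomes its own 0/1 integer column.
one_hot = pd.get_dummies(df, columns=["plan"], dtype=int)

# Label-encode alternative: map each category to an integer code.
# Codes follow sorted category order, which implies an ordering the
# data may not really have.
label_encoded = df.assign(plan=df["plan"].astype("category").cat.codes)

print(one_hot.columns.tolist())
print(label_encoded["plan"].tolist())
```

Whichever encoding you choose, the rows you later send for prediction presumably need the same columns in the same order, so keep the encoding code next to your upload script.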
Step 1 — install + authenticate
pip install "gnosyslabs[data]" # adds pandas + pyarrow
export GNOSYS_API_KEY=gn_live_…
The [data] extra is what makes client.datasets.upload(df, …)
work with a DataFrame in memory; without it you have to upload
files from disk.
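Before moving on, it can help to confirm that the optional dependencies and the API key are actually in place. The helpers below are a hypothetical stdlib-only check (they are not part of the gnosys package) that looks for the `pandas` and `pyarrow` modules named in the install comment and for the `GNOSYS_API_KEY` environment variable.

```python
import importlib.util
import os


def missing_data_extras(modules=("pandas", "pyarrow")):
    """Return the names of optional modules that cannot be imported."""
    return [m for m in modules if importlib.util.find_spec(m) is None]


def api_key_is_set(env=os.environ):
    """True when GNOSYS_API_KEY is present and non-empty."""
    return bool(env.get("GNOSYS_API_KEY"))


print("missing extras:", missing_data_extras())
print("API key set:", api_key_is_set())
```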
Step 2 — upload your training data
import pandas as pd
from gnosys import GnosysClient
client = GnosysClient() # reads $GNOSYS_API_KEY
df = pd.read_csv("my_classification_data.csv")
# Features must be numeric; the label column must match target_column below.
dataset = client.datasets.upload(
df,
name="my_first_dataset",
task="classification",
target_column="label",
)
print(dataset.dataset_id) # tds_abc1234… — keep this for the next call
print(dataset.n_samples, dataset.n_features, dataset.n_classes)
Validation that runs server-side on the upload:
- every feature column must be numeric (int / float / bool);
- labels: at least 2 unique, at most 20 unique for classification;
- no NaNs in feature columns;
- upload size ≤ 50 MB.
If anything fails you get a 422 with a clear code + message
pointing at the column.
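To catch these problems before the round trip, you can mirror the checks locally. The `preflight` function below is a hypothetical helper, not part of the client; it approximates the rules listed above, and its size check uses in-memory size as a rough stand-in for the serialized upload size (and assumes 50 MB means binary megabytes).

```python
import pandas as pd

MAX_CLASSES = 20               # classification limit from the list above
MAX_BYTES = 50 * 1024 * 1024   # 50 MB, assuming binary megabytes


def preflight(df, target_column, task="classification"):
    """Return a list of problems found; an empty list means none were found."""
    if target_column not in df.columns:
        return [f"target column {target_column!r} is missing"]

    problems = []
    features = df.drop(columns=[target_column])
    for col in features.columns:
        series = features[col]
        if not pd.api.types.is_numeric_dtype(series):
            problems.append(f"{col}: dtype {series.dtype} is not numeric")
        if series.isna().any():
            problems.append(f"{col}: contains NaN")

    if task == "classification":
        n_classes = df[target_column].nunique()
        if not 2 <= n_classes <= MAX_CLASSES:
            problems.append(f"{target_column}: {n_classes} unique labels")

    # In-memory size is only a rough proxy for the serialized upload size.
    if df.memory_usage(deep=True).sum() > MAX_BYTES:
        problems.append("frame is larger than 50 MB in memory")
    return problems


good = pd.DataFrame({"x1": [0.1, 0.2, 0.3], "x2": [1, 0, 1], "label": [0, 1, 0]})
bad = pd.DataFrame({"x1": [0.1, None, 0.3], "x2": ["a", "b", "c"], "label": [0, 0, 0]})
print(preflight(good, "label"))
print(preflight(bad, "label"))
```

The server remains the source of truth; a clean local result only means the obvious failure modes are ruled out.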
You can list, get, and delete uploaded datasets:
for d in client.datasets.list():
print(d.dataset_id, d.name, d.n_samples, d.created_at)
client.datasets.delete("tds_abc1234…")
Soft-delete only — the on-disk parquet is retained for audit.
Step 3 — run a Gnosys job on your data
Same shape as the synthetic-data quickstart, just with your
dataset_id:
run = client.runs.create(
domain="tabular",
strategist={"kind": "hp_sweep", "key": "C",
"values": [0.001, 0.01, 0.1, 1.0, 10.0, 100.0]},
spec_template={
"spec_id": "_t",
"name": "first run",
"hypothesis": "regularisation strength sweep",
"task": "classification",
"dataset_id": dataset.dataset_id,
"model_family": "logistic",
"hyperparameters": {"C": 1.0},
},
max_iterations=10,
)
run = client.runs.wait(run.run_id, timeout=600)
print(run.status, run.best_tier, run.best_primary_score)
Behind the scenes Gnosys's executor splits your uploaded data
into train / validation / honest-eval cohorts, fits each
hyperparameter point, and runs the full validation layer
(multi-calibration, dist-shift, the four honest-eval verifiers,
the LLM critic if you've enabled it). The best_tier /
best_primary_score on the returned Run are computed on the
server's internal test split.
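One way to picture what "fits each hyperparameter point" means for an hp_sweep: the template is copied once per swept value, with that value written into `hyperparameters`. The sketch below is an assumption about the shape of that expansion, not the executor's actual code; `expand_sweep` and its spec_id numbering scheme are hypothetical.

```python
import copy


def expand_sweep(strategist, spec_template, max_iterations):
    """Build one spec per swept value, capped at max_iterations."""
    key = strategist["key"]
    specs = []
    for i, value in enumerate(strategist["values"][:max_iterations]):
        spec = copy.deepcopy(spec_template)          # leave the template untouched
        spec["hyperparameters"][key] = value         # override the swept key
        spec["spec_id"] = f"{spec_template['spec_id']}{i}"  # hypothetical id scheme
        specs.append(spec)
    return specs


strategist = {"kind": "hp_sweep", "key": "C",
              "values": [0.001, 0.01, 0.1, 1.0, 10.0, 100.0]}
template = {"spec_id": "_t", "model_family": "logistic",
            "hyperparameters": {"C": 1.0}}

for spec in expand_sweep(strategist, template, max_iterations=10):
    print(spec["spec_id"], spec["hyperparameters"])
```

Under this reading, six values with max_iterations=10 yield six specs, and the `C` in the template acts only as a default that each point overrides.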
Step 4 — predict on rows the server has never seen
For evaluation against your own held-out data, model deployment, or just scoring fresh rows in production:
new_rows = pd.read_csv("rows_to_score.csv")
preds = client.runs.predict(run.run_id, new_rows)
print(preds.predictions[:5]) # decoded class labels
print(preds.classes) # column order of probas
print(preds.probas[0]) # [P(class_0), P(class_1), …]
The server reloads the run's promoted model (run.has_promoted_model
is True whenever a model is available), applies it to your rows,
and returns predictions plus predict_proba output. Your test rows
never touch Gnosys's training side.
You can also pass:
- a numpy ndarray, together with feature_names=… so the server can validate column order;
- a path to a CSV / parquet file.
import numpy as np
preds = client.runs.predict(
run.run_id,
np.random.randn(8, dataset.n_features),
feature_names=dataset.feature_names,
)
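The relationship between `classes` and `probas` is easiest to see on plain lists. The sketch below uses hand-written stand-ins for `preds.classes` and `preds.probas` (no server call), and assumes both behave like ordinary Python lists; whether `preds.predictions` is exactly the argmax shown here is also an assumption.

```python
# Stand-ins for preds.classes and preds.probas; the values are made up.
classes = [0, 1]
probas = [[0.9, 0.1], [0.35, 0.65], [0.5, 0.5]]


def column_for(label, classes):
    """Index of a label's probability column."""
    return list(classes).index(label)


def argmax_labels(classes, probas):
    """Most probable class per row; ties go to the earlier column."""
    return [classes[max(range(len(row)), key=row.__getitem__)] for row in probas]


def labels_at_threshold(positive, negative, classes, probas, threshold):
    """Binary decision using a custom cut-off on the positive class."""
    col = column_for(positive, classes)
    return [positive if row[col] >= threshold else negative for row in probas]


print(argmax_labels(classes, probas))
print(labels_at_threshold(1, 0, classes, probas, threshold=0.4))
```

Looking up the column through `classes` rather than hard-coding an index keeps the code correct if the server orders the classes differently from what you expect.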
End-to-end script
Full notebook-ready snippet:
import pandas as pd
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from gnosys import GnosysClient
client = GnosysClient()
df = pd.read_csv("my_data.csv")
# Hold out 25% locally — these rows never leave your machine until
# the final scoring call.
train_df, test_df = train_test_split(
df, test_size=0.25, stratify=df["label"], random_state=42,
)
dataset = client.datasets.upload(
train_df,
name="experiment_v1",
task="classification",
target_column="label",
)
run = client.runs.create(
domain="tabular",
strategist={"kind": "hp_sweep", "key": "C",
"values": [0.01, 0.1, 1.0, 10.0]},
spec_template={
"spec_id": "_t", "name": "v1",
"hypothesis": "first sweep on real data",
"task": "classification",
"dataset_id": dataset.dataset_id,
"model_family": "logistic",
"hyperparameters": {"C": 1.0},
},
max_iterations=8,
)
run = client.runs.wait(run.run_id, timeout=600)
X_test = test_df.drop(columns=["label"])
y_test = test_df["label"].to_numpy()
preds = client.runs.predict(run.run_id, X_test)
# Column of the positive class (assumed to be labelled 1) in the probas output.
pos_col = preds.classes.index(1)
p_pos = [row[pos_col] for row in preds.probas]
print("local test AUC:", roc_auc_score(y_test, p_pos))
What if no model gets promoted?
If every spec gets blocked by the validation layer (e.g. severe
calibration failures or the deception test fires), run.best_tier
will be rejected_validation / rejected_below_floor,
run.has_promoted_model will be False, and client.runs.predict
will return 409. Read /v1/findings?run_id=… to see what was
rejected and why; commonly the fix is regularisation, more data,
or removing a leaking feature.
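A small guard avoids the 409 altogether by checking `run.has_promoted_model` first. `predict_if_promoted` below is a hypothetical wrapper, and the stub objects exist only so the sketch runs without a server; with the real client you would pass your `client` and `run` directly.

```python
from types import SimpleNamespace


def predict_if_promoted(client, run, rows):
    """Call predict only when the run has a promoted model; else return None."""
    if not run.has_promoted_model:
        print(f"no promoted model (best_tier={run.best_tier}); "
              "check the findings for this run")
        return None
    return client.runs.predict(run.run_id, rows)


# Stand-ins for the real client and Run objects, for demonstration only.
stub_client = SimpleNamespace(
    runs=SimpleNamespace(predict=lambda run_id, rows: ("scored", run_id, len(rows)))
)
rejected = SimpleNamespace(run_id="run_a", has_promoted_model=False,
                           best_tier="rejected_validation")
promoted = SimpleNamespace(run_id="run_b", has_promoted_model=True,
                           best_tier=None)

print(predict_if_promoted(stub_client, rejected, [[0.1, 0.2]]))
print(predict_if_promoted(stub_client, promoted, [[0.1, 0.2]]))
```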
Plan caps
| Plan | datasets cap | predict calls / mo |
|---|---|---|
| free | 2 | 500 |
| starter | 20 | 10,000 |
| team | 200 | 100,000 |
| enterprise | unlimited | unlimited |
Hitting the cap returns 402; delete an unused dataset or upgrade.