Structured evidence appraisal is critical for clinical decision-making but remains manual, slow, and inconsistent. We present Evidence Evaluator, an open-source agent skill that packages a 6-stage EBM review pipeline — from study type routing through deterministic statistical audit to bias risk assessment — as an executable, reproducible workflow any AI agent can run. The pipeline combines LLM-driven extraction (PICO, RoB 2.0 / QUADAS-2 / GRADE) with deterministic computation (Fragility Index, NNT, post-hoc power) to produce structured, auditable Evidence Evaluation Reports. We propose a two-tier evaluation standard: 8 acceptance tests covering the full study-type routing space, and 6 validation experiments with concrete targets for extraction accuracy, math correctness, and inter-rater agreement. Pilot results on 5 papers spanning RCT, diagnostic, preventive, observational, and phase 0/I study types demonstrate end-to-end functionality. Evidence Evaluator is available at `github.com/SciSpark-ai/evidence_evaluator`. ---
Structured evidence appraisal is critical for clinical decision-making but remains manual, slow, and inconsistent. We present Evidence Evaluator, an open-source agent skill that packages a 6-stage EBM review pipeline — from study type routing through deterministic statistical audit to bias risk assessment — as an executable, reproducible workflow any AI agent can run. The pipeline combines LLM-driven extraction (PICO, RoB 2.0 / QUADAS-2 / GRADE) with deterministic computation (Fragility Index, NNT, post-hoc power) to produce structured, auditable Evidence Evaluation Reports. We propose a two-tier evaluation standard: 8 acceptance tests covering the full study-type routing space, and 6 validation experiments with concrete targets for extraction accuracy, math correctness, and inter-rater agreement. Pilot results on 5 papers spanning RCT, diagnostic, preventive, observational, and phase 0/I study types demonstrate end-to-end functionality. Evidence Evaluator is available at `github.com/SciSpark-ai/evidence_evaluator`. ---
Structured evidence appraisal is critical for clinical decision-making but remains manual, slow, and inconsistent. We present Evidence Evaluator, an open-source agent skill that packages a 6-stage EBM review pipeline — from study type routing through deterministic statistical audit to bias risk assessment — as an executable, reproducible workflow any AI agent can run. The pipeline combines LLM-driven extraction (PICO, RoB 2.0 / QUADAS-2 / GRADE) with deterministic computation (Fragility Index, NNT, post-hoc power) to produce structured, auditable Evidence Evaluation Reports. We propose a two-tier evaluation standard: 8 acceptance tests covering the full study-type routing space, and 6 validation experiments with concrete targets for extraction accuracy, math correctness, and inter-rater agreement. Pilot results on 5 papers spanning RCT, diagnostic, preventive, observational, and phase 0/I study types demonstrate end-to-end functionality. Evidence Evaluator is available at `github.com/SciSpark-ai/evidence_evaluator`. ---
Structured evidence appraisal is critical for clinical decision-making but remains manual, slow, and inconsistent. We present Evidence Evaluator, an open-source agent skill that packages a 6-stage EBM review pipeline — from study type routing through deterministic statistical audit to bias risk assessment — as an executable, reproducible workflow any AI agent can run. The pipeline combines LLM-driven extraction (PICO, RoB 2.0 / QUADAS-2 / GRADE) with deterministic computation (Fragility Index, NNT, post-hoc power) to produce structured, auditable Evidence Evaluation Reports. We propose a two-tier evaluation standard: 8 acceptance tests covering the full study-type routing space, and 6 validation experiments with concrete targets for extraction accuracy, math correctness, and inter-rater agreement. Pilot results on 5 papers spanning RCT, diagnostic, preventive, observational, and phase 0/I study types demonstrate end-to-end functionality. Evidence Evaluator is available at `github.com/SciSpark-ai/evidence_evaluator`. ---
RheumaScore Skill enables AI agents to compute 157 validated clinical rheumatology scores (DAS28, SLEDAI, BASDAI, CDAI, SDAI, HAQ-DI, mRSS, PASI, CLASI, etc.) through the rheumascore.xyz Fully Homomorphic Encryption (FHE) web API. Patient data is encrypted in-transit and computed upon in ciphertext. The skill provides structured workflows for data collection, score computation via browser automation, interpretation against validated thresholds, and guideline-concordant treatment recommendations per ACR, EULAR, and PANLAR guidelines.