Deduplicate and filter training data
nemo-curator · skill · setup L3 · ★9,423
Orchestra-Research/AI-Research-SKILLs
What it does
Deduplicates and filters training data using GPU-accelerated curation with NeMo Curator
Best for
GPU-accelerated data curation for LLM training
Inputs
- · NeMo Curator requirement
- · Implementation context
Outputs
- · Implementation guide
- · Best practices
- · Reference examples
Requires
- · Python
- · pip
- · NVIDIA GPU with CUDA (for the GPU-accelerated modules)
Preconditions
- · Understanding of NeMo Curator
- · Appropriate development environment
Failure modes
- · Missing dependencies or incompatible versions
- · Configuration or environment issues
- · Incorrect implementation or testing gaps
Trust signals
- · Code examples provided
- · Open source licensed
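Example sketch
The "deduplicate and filter" task in the title can be illustrated with a minimal, CPU-only sketch in plain Python. This is not NeMo Curator's API: the function names `normalize` and `dedup_and_filter` and the `min_words` threshold are hypothetical, chosen only to show the idea of exact deduplication by content hash combined with a simple length filter.

```python
import hashlib


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different copies hash alike."""
    return " ".join(text.lower().split())


def dedup_and_filter(docs, min_words=5):
    """Keep the first copy of each document that has at least `min_words` words.

    Hypothetical helper for illustration; not part of any library.
    """
    seen = set()
    kept = []
    for doc in docs:
        norm = normalize(doc)
        if len(norm.split()) < min_words:
            continue  # filter step: drop documents that are too short
        digest = hashlib.sha256(norm.encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # dedup step: an identical normalized document was already kept
        seen.add(digest)
        kept.append(doc)
    return kept


docs = [
    "The quick brown fox jumps over the lazy dog.",
    "the quick  brown fox jumps over the lazy dog.",  # duplicate after normalization
    "Too short.",                                      # removed by the length filter
    "Training corpora often contain many repeated web pages.",
]
kept = dedup_and_filter(docs)
```

Here `kept` holds the first and last documents. A hash of normalized text only catches exact duplicates; near-duplicate detection needs a different technique, which this sketch does not attempt.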