
Introduction
Over the past week, I completed a hands-on security engineering project that blends traditional NIST Risk Management Framework (RMF) practices with modern AI security and agentic AI threat modeling. The result was a fully functional Mini RMF Security Package + Agentic AI Threat Model built around a Flask API, complete with automated tests, guardrails, observability, and formal security documentation.
My goal was to simulate the real-world responsibilities of an Information Systems Security Engineer (ISSE), from system assessment and RMF documentation to securing AI workloads and evaluating emerging threat surfaces.
This project is ideal for anyone building a security engineering portfolio, transitioning into AI security or AI Solutions Engineering, or preparing for GRC/ISSE roles that require hands-on security knowledge.
AI-Assisted Development Notice
This project was built using a human-in-the-loop approach. ChatGPT and Claude were used as development assistants to refine code patterns, troubleshoot errors, and speed up iteration—mirroring real-world use of AI copilots in cybersecurity engineering. All RMF decisions, threat models, and architectural designs were created manually.
What I Built (Project Overview)
The mission was simple:
➡️ Assess a web application using NIST RMF methods
➡️ Apply AI guardrails and observability
➡️ Document everything in RMF-aligned artifacts
Project deliverables included:
- Flask-based API assessment target
- Input validation + security test cases
- Guardrails for PII + toxicity filtering
- NIST 800-53 control mappings
- SAST scanning with Bandit
- Agentic AI threat model
- Automated security scripts
- Arize Phoenix AI observability
This blend of traditional and AI security mirrors what modern cybersecurity roles now expect.
Day 1–3: Building Security Foundations With RMF
Setting Up the Flask Assessment Environment
To create a realistic and repeatable security test environment, I built a small REST API using:
- Flask (backend)
- OpenAI API (model inference)
- Guardrails AI (security guardrails)
- Bandit (static analysis)
- Arize Phoenix (AI observability + logging)
This architecture created a baseline to evaluate input validation, inference behavior, API security, and AI output risks.
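Before walking through the test cases, here is a trimmed-down sketch of the API's core request path, showing how input validation, the guardrail layer, and model inference fit together. The endpoint mirrors the project's /ask route, but the helper functions (run_guardrails, call_model) are illustrative stubs rather than the project's exact code.
# app.py — simplified sketch of the assessment target's /ask endpoint (names are placeholders)
from flask import Flask, jsonify, request

app = Flask(__name__)

def run_guardrails(prompt: str) -> tuple[str, bool]:
    # Stand-in for the Guardrails AI layer described later (PII redaction, toxicity checks)
    return prompt, False

def call_model(prompt: str, user_id: str) -> str:
    # Stand-in for the OpenAI inference call traced by Phoenix
    return f"echo for {user_id}: {prompt}"

@app.route("/ask", methods=["POST"])
def ask():
    payload = request.get_json(silent=True) or {}
    prompt = (payload.get("prompt") or "").strip()
    user_id = payload.get("user_id", "anonymous")
    # Input validation: reject empty prompts before any model call
    if not prompt:
        return jsonify({"error": "prompt must not be empty"}), 400
    # Guardrail layer: sanitize or block unsafe content before inference
    safe_prompt, blocked = run_guardrails(prompt)
    if blocked:
        return jsonify({"error": "request rejected by content guardrails"}), 422
    return jsonify({"answer": call_model(safe_prompt, user_id)}), 200

if __name__ == "__main__":
    app.run(port=8080)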
Security Test Cases (Practical RMF Validation)
1. Input Validation Test
curl -X POST http://localhost:8080/ask \
-H "Content-Type: application/json" \
-d '{"prompt": "", "user_id": "test_user"}'
✔️ API blocks empty inputs.
2. PII Exposure Test
curl -X POST http://localhost:8080/ask \
-H "Content-Type: application/json" \
-d '{"prompt": "My email is test@example.com"}'
✔️ Guardrails detects and sanitizes personal data.
3. Toxic Language Detection
curl -X POST http://localhost:8080/ask \
-H "Content-Type: application/json" \
-d '{"prompt": "You are stupid"}'
✔️ Toxic messages are blocked safely.
These tests exercise controls from several NIST 800-53 families, including SC-7 (boundary protection), IA-2 (identification and authentication), and SI-2 (flaw remediation).
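The same checks can also be scripted for repeatability. The pytest sketch below mirrors the three curl tests above and assumes the API is running locally on port 8080; the file name and expected status codes are illustrative, not the project's exact suite.
# test_api_security.py — automated versions of the manual curl checks (expectations are illustrative)
import requests

BASE_URL = "http://localhost:8080/ask"

def test_empty_prompt_is_rejected():
    r = requests.post(BASE_URL, json={"prompt": "", "user_id": "test_user"})
    assert r.status_code == 400  # input validation blocks empty prompts

def test_pii_is_sanitized():
    r = requests.post(BASE_URL, json={"prompt": "My email is test@example.com"})
    assert "test@example.com" not in r.text  # guardrails should redact the address

def test_toxic_prompt_is_blocked():
    r = requests.post(BASE_URL, json={"prompt": "You are stupid"})
    assert r.status_code in (400, 422)  # toxic content rejected before inference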
NIST 800-53 Control Mapping (Portfolio-Ready RMF Work)
I mapped controls directly into a lightweight System Security Plan (SSP), covering:
- AC-2 – Account management
- IA-2 – Identification and authentication
- SC-7 – Boundary protection
- AU-2 / AU-6 – Event logging and audit review
- CM-2 – Baseline configuration
- SI-2 – Flaw remediation
- PL-8 – Security architecture
This documentation is ideal for showcasing practical RMF and ISSE skills.
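To make the format concrete, a single entry in this lightweight SSP looked roughly like the following (wording is representative rather than verbatim):
- Control: SC-7 – Boundary protection
- Implementation: The Flask API exposes a single /ask endpoint, and every request passes input validation and the guardrail layer before reaching the model.
- Assessment evidence: input validation test results, the Bandit scan report, and prompt–response traces captured in Phoenix.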
Static Code Analysis with Bandit
Using Bandit 1.9.2, I scanned for:
- Hardcoded secrets
- Command injection
- SQL injection patterns
- Unsafe functions
- Weak cryptographic methods
Each finding included remediation steps—another critical skill expected in DevSecOps and ISSE roles.
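For reference, a typical scan invocation looked like the command below; the app/ path and report file name are placeholders for this project's layout.
# recursive scan with a JSON report for the automation scripts described later
bandit -r app/ -f json -o bandit_report.json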
Day 4–7: Agentic AI Security, Threat Modeling, and Automation
Agentic AI Threat Model (High-Value Portfolio Asset)
As agentic systems gain autonomy, their risks increase.
My threat model analyzed five high-severity risk categories:
- Prompt Injection
- Data Leakage
- Over-Permissioned Agents
- Hallucinations
- Unsafe Tool Actions
Each risk includes:
✔ likelihood
✔ impact
✔ mitigation strategies
This is a powerful differentiator for AI governance, AI safety, and security engineering roles.
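As an example of the format, here is a representative entry for one category (wording is illustrative rather than verbatim from the document):
- Risk: Prompt injection – untrusted input steers the model or agent into ignoring its instructions
- Likelihood: High | Impact: High
- Mitigations: input sanitization, guardrail validation before any tool call, least-privilege permissions for agent tools, and prompt–response tracing in Phoenix for detection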
Implementing Guardrails AI
To secure the LLM inference pipeline, I implemented:
- PII detection + redaction
- Toxic language filters
- Output schema validation
- Input sanitization
- Abuse prevention
- Error handling
These methods are widely used in production-grade AI applications.
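A minimal sketch of the guard configuration is shown below. It assumes the DetectPII and ToxicLanguage validators have been installed from the Guardrails Hub; import paths, parameters, and on_fail behavior vary across Guardrails AI versions, so treat it as illustrative rather than the project's exact code.
# guardrails_layer.py — illustrative guard setup (validator options vary by version)
from guardrails import Guard
from guardrails.hub import DetectPII, ToxicLanguage  # installed separately via the Guardrails Hub CLI

guard = Guard().use_many(
    DetectPII(pii_entities=["EMAIL_ADDRESS", "PHONE_NUMBER"], on_fail="fix"),            # redact PII
    ToxicLanguage(threshold=0.5, validation_method="sentence", on_fail="exception"),      # block toxic input
)

def run_guardrails(prompt: str) -> tuple[str, bool]:
    # Returns (sanitized_prompt, blocked) for use in the /ask handler sketched earlier
    try:
        outcome = guard.validate(prompt)
        if not outcome.validation_passed:
            return prompt, True
        return outcome.validated_output or prompt, False
    except Exception:
        # on_fail="exception" raises when toxic content is detected
        return prompt, True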
AI Observability With Arize Phoenix
AI observability is quickly becoming a core control area for AI systems.
Phoenix allowed me to monitor:
- Prompt–response logs
- Latency and performance
- Anomaly detection
- Drift detection
- Tracing for audit and compliance
This aligns with AI governance frameworks like NIST AI RMF.
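Wiring Phoenix into the pipeline is largely configuration. The sketch below shows one way to do it with the arize-phoenix-otel package and the OpenInference OpenAI instrumentor, assuming a local Phoenix instance is running; the project name and endpoint are placeholders.
# tracing.py — illustrative Phoenix setup (project name and endpoint are placeholders)
from phoenix.otel import register
from openinference.instrumentation.openai import OpenAIInstrumentor

# Send traces to a locally running Phoenix instance
tracer_provider = register(
    project_name="mini-rmf-flask-api",
    endpoint="http://localhost:6006/v1/traces",
)

# Capture prompt/response spans, latency, and token usage for OpenAI calls
OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)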
Security Automation Scripts
I automated:
- SAST scans
- API security tests
- Log analysis
- Environment validation
- Report generation
This aligns with DevSecOps best practices and demonstrates engineering maturity.
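The automation was mostly thin Python wrappers around the tools above. A representative sketch is shown below; paths, file names, and the report format are placeholders rather than the project's exact scripts.
# run_security_checks.py — representative automation wrapper (paths and file names are placeholders)
import json
import subprocess
from datetime import datetime, timezone

def run_bandit(target: str = "app/") -> int:
    # SAST scan; a non-zero exit code means Bandit reported findings
    result = subprocess.run(
        ["bandit", "-r", target, "-f", "json", "-o", "bandit_report.json"],
        capture_output=True,
        text=True,
    )
    return result.returncode

def run_api_tests() -> int:
    # Reuses the pytest suite that mirrors the manual curl checks
    return subprocess.run(["pytest", "test_api_security.py", "-q"]).returncode

def main() -> None:
    summary = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "bandit_exit_code": run_bandit(),
        "api_tests_exit_code": run_api_tests(),
    }
    with open("security_report.json", "w") as fh:
        json.dump(summary, fh, indent=2)
    print(json.dumps(summary, indent=2))

if __name__ == "__main__":
    main()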
Key Takeaways
Skills Demonstrated
- NIST RMF application
- AI safety engineering
- Secure coding
- SAST integration (Bandit)
- AI threat model creation
- Flask API security
- Observability & telemetry
- DevSecOps automation
- Security documentation
Why This Project Matters for Cybersecurity Careers
This project directly supports roles such as:
- Security Engineer
- ISSE
- Cloud Security Engineer
- DevSecOps Engineer
- AI Governance / AI Safety Specialist
- AI Solutions Engineer (Security)
It is portfolio-ready and showcases both technical depth and security documentation skills.
Lessons Learned
What Worked Well
- Guardrails AI reduced harmful outputs effectively
- Phoenix gave real-time visibility into LLM behavior
- Modular tests made validation fast
Challenges
- Tuning thresholds for PII detection
- Managing complex dependencies
- Ensuring security didn’t reduce usability
Future Enhancements
- Add container scanning (Trivy)
- Add IaC scanning (Checkov)
- Build CI/CD pipelines
- Add adversarial ML tests
- Add performance benchmarking
Conclusion
This 7-day sprint wasn’t just a security engineering project—it was a deep learning experience that pushed me into new technologies, new tools, and new ways of thinking about cybersecurity in the age of AI. Working across Flask, Guardrails AI, Phoenix, Bandit, and NIST RMF reinforced how fast the security landscape is evolving and how important it is to stay curious, experimental, and adaptable.
By combining traditional RMF practices with modern AI security techniques, I learned how to bridge two worlds: established cybersecurity frameworks and emerging agentic AI architectures. Exploring observability, threat modeling, and guardrail design helped me understand not only how these systems work, but how they can fail—and what controls are needed to strengthen them.
Most importantly, this project reminded me that learning new technologies is the fastest way to grow as an agentic AI security or solutions engineer. Every tool I used led to another question, another insight, or another experiment. That constant iteration is what transforms theory into expertise.
As I continue moving toward AI cybersecurity and AI solutions engineering, I plan to keep taking on projects like this: projects that challenge me, expose me to new technologies, and help me build hands-on experience in the future of security.
The full codebase, security documentation, threat model, and automation scripts will be available on GitHub for anyone who wants to explore or build on this work.
As someone aiming to transition into agentic AI security or AI solutions engineering, I am using projects like this to showcase my skills to prospective employers.
Tools & Technologies Used
- Flask 3.1.2
- OpenAI API 1.109.1
- Guardrails AI 0.7.0
- Arize Phoenix OTEL 0.14.0
- Bandit 1.9.2
- Python 3.x
- NIST RMF + 800-53
Project Duration: 7 days
Lines of Code: ~500
Documentation: 8 pages
Controls Mapped: 10