Scenario 03: Legacy system data-centric security retrofit
Overview
NATO organizations operate numerous legacy applications that were not designed with data-centric security principles. These systems contain mixed-sensitivity information (from unclassified to highly classified) but lack granular access controls. The challenge is to retrofit data-centric security onto these systems without rewriting the applications, enabling dynamic content filtering based on user permissions.
Problem statement
Legacy systems display all information to authenticated users regardless of their specific clearances or need-to-know. This violates data-centric security principles and creates operational and compliance risks. The system must become contextually aware of data sensitivity and apply appropriate labels/metadata to control what each user can see.
Actors
Legacy Application
- Type: Existing NATO information system (e.g., intelligence database, operational planning tool, logistics system)
- Architecture: Monolithic application, not designed for DCS
- Data Model: Mixed sensitivity data in same database/interface
- Authentication: Existing user authentication (LDAP, AD, PKI)
- Constraints: Cannot be rewritten or significantly modified
Users
- Clearance Levels: Range from unclassified to CTS (Cosmic Top Secret)
- Special Access Programs: Various SAPs and compartments
- Nationality: Multiple NATO nations
- Roles: Analysts, operators, commanders, administrators
DCS retrofit layer
- Role: Intermediary between users and legacy application
- Capabilities: Content analysis, labeling, policy enforcement, filtering
- Integration: Must work with existing authentication and data stores
Scenario flow
Phase 1: User Access Request
Context: User authenticates to legacy system through normal process.
Action: User requests to view information (e.g., opens intelligence report, queries database, views operational plan).
Current Behavior: System displays all information user has database access to, regardless of sensitivity.
Desired Behavior: System analyzes content, applies labels, and filters display based on user's clearance/permissions.
Phase 2: Content Analysis and Labeling
Context: Legacy system contains unlabeled data of varying sensitivity.
Challenge: System must automatically: - Identify sensitive information within documents/records - Determine appropriate classification level - Identify required special access programs - Apply metadata labels to content
Examples: - Intelligence report mentioning specific operation → Label: CTS + Operation WALL - Logistics data with unit locations → Label: NS - Administrative information → Label: Unclassified - Mixed document with multiple sensitivity levels → Label: Highest level + all required SAPs
Phase 3: Policy Evaluation
Context: Content is now labeled with sensitivity metadata.
Action: System evaluates user's attributes against content labels: - User clearance level vs content classification - User SAP memberships vs content SAP requirements - User nationality vs content releasability - User role vs content need-to-know
Decision: Grant full access, partial access (redacted), or deny access.
Phase 4: Dynamic Content Filtering
Context: Policy evaluation determines what user can see.
Action: System dynamically filters content: - Full Access: Display complete content - Partial Access: Redact sensitive portions, display remainder - Deny Access: Hide content entirely or show "access denied" message
User Experience: Seamless - user sees only what they're authorized to see, without knowing what's been filtered.
Phase 5: Audit and Compliance
Context: All access attempts and filtering decisions must be logged.
Action: System records: - User identity and attributes - Content accessed (or attempted) - Labels applied to content - Policy decision (grant/partial/deny) - Timestamp and context
Purpose: Compliance, security monitoring, incident investigation.
Operational constraints
- No Application Rewrite: Legacy application code cannot be significantly modified
- Existing Authentication: Must integrate with current authentication systems
- Performance: Minimal latency impact (suitable for interactive use)
- Transparency: Users should not notice the DCS layer (seamless experience)
- Accuracy: Content labeling must be very highly accurate (suitable for operational security)
- Scalability: Must handle existing user load and data volumes
- Backward Compatibility: System must continue to function if DCS layer fails
- Data Variety: Handle multiple content types (text, images, structured data, documents)
Technical challenges
- Automatic Content Classification: How to accurately identify sensitive information in unlabeled data?
- Context Understanding: How to determine appropriate labels based on content context?
- Integration Architecture: Where does DCS layer sit (proxy, middleware, database layer)?
- Performance: How to analyze and label content in real-time without impacting user experience?
- Mixed Sensitivity Content: How to handle documents with multiple classification levels?
- Granularity: What level of granularity (document, paragraph, sentence, field)?
- Label Persistence: Should labels be stored or computed on-demand?
- Policy Management: How to define and update labeling and access policies?
- False Positives/Negatives: How to handle misclassification?
- Legacy Data Formats: How to parse and analyze diverse legacy data formats?
Acceptance criteria
AC1: Automatic Content Labeling
- System automatically analyzes unlabeled content
- Identifies classification level (Unclassified, NS, CTS, etc.)
- Identifies required SAPs and compartments
- Applies metadata labels to content
- Labeling accuracy is very high (suitable for operational use)
- Labeling completes with low latency (suitable for interactive use)
AC2: User Attribute Integration
- System retrieves user clearance level from existing authentication
- System retrieves user SAP memberships
- System retrieves user nationality and role
- Integration works with LDAP, Active Directory, and PKI systems
- User attributes cached for fast lookup
AC3: Policy-Based Access Control
- System evaluates user attributes against content labels
- Enforces clearance level requirements
- Enforces SAP requirements
- Enforces nationality-based releasability
- Enforces need-to-know based on role
- Policy evaluation completes with very low latency
AC4: Dynamic Content Filtering
- Full access: Display complete content when authorized
- Partial access: Redact sensitive portions, display remainder
- Deny access: Hide content or show access denied message
- Filtering preserves document structure and readability
- User cannot detect what has been filtered
- Filtering works for text, structured data, and documents
AC5: Granular Filtering
- Filter at document level (show/hide entire document)
- Filter at section level (show/hide sections within document)
- Filter at paragraph level (redact specific paragraphs)
- Filter at field level (redact database fields)
- Maintain context and readability after filtering
AC6: Multiple Content Types
- Handle text documents (Word, PDF, plain text)
- Handle structured data (database records, XML, JSON)
- Handle images (redact sensitive portions)
- Handle mixed media documents
- Consistent labeling across content types
AC7: Seamless Integration
- No changes required to legacy application code
- Works with existing authentication mechanisms
- Transparent to end users (no new login, no UI changes)
- Graceful degradation if DCS layer fails (fail-open or fail-closed based on policy)
- Compatible with existing backup and recovery procedures
AC8: Performance
- Content labeling completes with low latency
- Policy evaluation completes with very low latency
- User attribute lookup is fast
- Total added latency is minimal and acceptable for interactive use
- No significant degradation of legacy application performance
- Scales to existing user load without performance issues
AC9: Comprehensive Audit Trail
- Log all access attempts (successful and denied)
- Log content labels applied
- Log policy decisions (grant/partial/deny)
- Log user attributes at time of access
- Log redactions performed
- Logs tamper-proof and retained per compliance requirements
- Audit query interface for security monitoring
AC10: Policy Management
- Administrators can define labeling rules
- Administrators can define access policies
- Policy updates take effect immediately (or within X minutes)
- Policy versioning and rollback capability
- Policy testing/simulation before deployment
- Policy conflicts detected and reported
AC11: Accuracy and Reliability
- Content classification accuracy is very high (suitable for operational security)
- False positive rate is low (over-classification minimized)
- False negative rate is very low (under-classification is critical security risk)
- Manual override capability for misclassified content
- Feedback mechanism to improve classification over time
AC12: Mixed Sensitivity Handling
- Correctly identify multiple classification levels in single document
- Apply highest classification level to document metadata
- Enable granular filtering within mixed documents
- Preserve document coherence after filtering
- Handle nested classifications (classified section within unclassified document)
Success metrics
- Classification Accuracy: Very high correct labeling rate on test dataset
- Performance Impact: Minimal added latency (acceptable for interactive use)
- User Satisfaction: Majority of users report seamless experience
- Security Improvement: All sensitive content protected by access controls
- Audit Completeness: All access attempts logged
- False Negative Rate: Very low (critical - under-classification is security risk)
- Deployment Time: Reasonable timeframe to retrofit existing system
- Maintenance Overhead: Minimal increase in system administration effort
Example use cases
Use case 1: Intelligence database
Legacy System: Intelligence database with reports from multiple sources and classification levels mixed together.
Current Problem: Analyst with NS clearance sees CTS information they shouldn't access.
DCS Solution: - System analyzes each report, identifies CTS content - Labels reports with appropriate classification - Analyst queries database, sees only NS and below reports - CTS reports hidden from analyst's view - Audit log records analyst's query and filtered results
Use case 2: Operational planning tool
Legacy System: Planning tool with operational plans containing mixed sensitivity information.
Current Problem: User without WALL SAP sees operation details they shouldn't know.
DCS Solution: - System analyzes plan, identifies WALL SAP content - Labels sections requiring WALL access - User opens plan, sees general information - WALL SAP sections redacted with "[REDACTED - WALL SAP REQUIRED]" - User unaware of what specific information is hidden - Audit log records partial access
Use case 3: Logistics system
Legacy System: Logistics database with unit locations, supply levels, movement plans.
Current Problem: Contractor with limited clearance sees sensitive unit locations.
DCS Solution: - System identifies sensitive fields (unit locations, movement plans) - Labels fields with appropriate classification - Contractor queries supply levels (authorized) - System shows supply data, redacts location fields - Contractor sees "[REDACTED]" for location fields - Audit log records field-level filtering
Out of scope
- Rewriting legacy applications
- Replacing existing authentication systems
- Real-time streaming data classification
- Classification of data in motion (network traffic)
- Encryption of data at rest (separate concern)
- Cross-domain solutions (separate scenario)
- AI/ML model training (assume pre-trained models available)
Related scenarios
- Scenario 01: Strategic sharing -- DCS for new data being shared
- Scenario 02: Tactical operations -- DCS in disconnected environments
- Scenario 03: This scenario -- DCS retrofit for legacy systems
For new systems, design DCS in from the start (Scenarios 01 and 02).
Key assumptions
- Content is Analyzable: Legacy data is in formats that can be parsed and analyzed
- Labeling Rules Exist: Organization has defined rules for classifying content
- User Attributes Available: User clearances and permissions are in accessible directory
- Performance Acceptable: Minimal added latency is acceptable to users for security benefits
- Accuracy Sufficient: Very high classification accuracy meets security requirements
- Manual Review Available: Misclassified content can be manually corrected
Risk considerations
Security Risks: - Under-classification (false negative) exposes sensitive data - Over-classification (false positive) reduces operational effectiveness - DCS layer bypass could expose all data - Audit log tampering could hide security incidents
Operational Risks: - Performance degradation impacts user productivity - Misclassification causes user frustration - System complexity increases maintenance burden - Integration failures could break legacy application
Mitigation Strategies: - Bias toward over-classification (fail secure) - Extensive testing before deployment - Manual review process for critical content - Graceful degradation if DCS layer fails - Comprehensive monitoring and alerting
Retrofitting DCS onto existing systems that weren't designed for it. For new systems, design DCS in from the start (Scenarios 01 and 02).