How GDS Used Data to Improve Content and User Journeys on GOV.UK
GOV.UK hosts more than 400,000 pieces of content, and users often struggled to find their way through it. The Government Digital Service (GDS) Data Science team used machine learning and network analysis to identify related content automatically and improve navigation paths. The work improved journeys for more than 10,000 users a day and cut internal search usage by 20-40%, showing how a data-driven approach can support content management.
Project Overview
Timeline: May - August 2019
Team: GDS Data Science Team
Technologies: Machine Learning, Network Analysis, Google Universal Sentence Encoder
Background
Content Scale
- 400,000+ unique content pieces
- 700 new pieces published weekly
- Only 2% had related links
Challenges
- Duplicate content
- Navigation difficulties
- Limited content relationships
Technical Approach
Semantic Vectors
- Used Google's Universal Sentence Encoder
- Identified similar content
- Supported taxonomy decisions
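To illustrate how semantic vectors can surface similar content, here is a minimal sketch that ranks pages by cosine similarity. It is not the team's code. The page paths and the three-dimensional vectors are invented stand-ins; the Universal Sentence Encoder produces 512-dimensional vectors, and cosine similarity is assumed here as the comparison measure.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def most_similar(target, vectors, top_n=3):
    """Return the top_n (path, score) pairs most similar to the target page."""
    scores = [
        (path, cosine_similarity(vectors[target], vec))
        for path, vec in vectors.items()
        if path != target
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_n]


# Hypothetical toy vectors standing in for real sentence-encoder output.
toy_vectors = {
    "/apply-passport": [0.9, 0.1, 0.0],
    "/renew-passport": [0.8, 0.2, 0.1],
    "/vehicle-tax": [0.0, 0.1, 0.9],
}

print(most_similar("/apply-passport", toy_vectors, top_n=2))
```

Pairs with a very high score are candidates for duplicate review, while moderately high scores suggest pages that might belong under the same taxonomy topic.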
Network Analysis
- Created Network Data Pipeline
- Analyzed user journeys
- Compared functional vs structural networks
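The comparison of the two networks can be sketched as follows. This assumes that the "functional" network means the page-to-page moves users actually make, and the "structural" network means the hyperlinks that exist between pages. The session data, page paths, and function names are hypothetical and do not come from the Network Data Pipeline itself.

```python
from collections import Counter


def functional_edges(sessions):
    """Count how often users moved directly from one page to another."""
    counts = Counter()
    for pages in sessions:
        for src, dst in zip(pages, pages[1:]):
            if src != dst:  # ignore page reloads
                counts[(src, dst)] += 1
    return counts


def compare_networks(functional, structural):
    """Split edges by whether they are linked, travelled, or both."""
    travelled = set(functional)
    return {
        "used_links": travelled & structural,
        "unused_links": structural - travelled,
        "unlinked_journeys": travelled - structural,
    }


# Hypothetical sessions (ordered page views) and hyperlink edges.
sessions = [["/a", "/b", "/c"], ["/a", "/b"], ["/c", "/a"]]
structural = {("/a", "/b"), ("/b", "/d")}

functional = functional_edges(sessions)
report = compare_networks(functional, structural)
print(report["unlinked_journeys"])
```

Edges in `unlinked_journeys` are moves users make without a direct link to follow, which makes them natural candidates for new related links.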
Machine Learning
- Implemented node2vec algorithm
- Generated automated related links
- Conducted A/B testing
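For readers unfamiliar with node2vec, the sketch below shows its core idea, a biased second-order random walk over the page network. It is a simplified illustration on an unweighted, undirected toy graph, not the team's implementation. The graph, the function name, and the parameter values are assumptions. In the full algorithm the walks are then fed to a skip-gram model to learn an embedding per page, a step omitted here.

```python
import random


def node2vec_walk(graph, start, length, p=1.0, q=1.0, rng=None):
    """Generate one biased walk; graph maps each node to a set of neighbours."""
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        current = walk[-1]
        neighbours = sorted(graph.get(current, ()))
        if not neighbours:
            break  # dead end
        if len(walk) == 1:
            # No previous node yet, so the first step is uniform.
            walk.append(rng.choice(neighbours))
            continue
        previous = walk[-2]
        weights = []
        for candidate in neighbours:
            if candidate == previous:
                weights.append(1.0 / p)  # step back to where we came from
            elif candidate in graph.get(previous, ()):
                weights.append(1.0)      # stay close to the previous node
            else:
                weights.append(1.0 / q)  # move further away
        walk.append(rng.choices(neighbours, weights=weights, k=1)[0])
    return walk


# Hypothetical undirected page graph.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

print(node2vec_walk(graph, "a", length=10, p=0.5, q=2.0, rng=random.Random(0)))
```

A low `p` keeps walks near their starting point, while a low `q` pushes them outward. Pages that co-occur in many walks end up with similar embeddings, which is what makes them candidates for related links.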
Results and Impact
Improvements
- Enhanced user journeys for 10,000+ users daily
- 20-40% reduction in internal search usage
- Automated link generation every 3 weeks
Quality Controls
- Confidence thresholds for generated links
- Content designer review for top 200 pages
- Exclusion rules for inappropriate linking
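The first and third controls could be applied as a simple filter over candidate links, sketched below. Every name, threshold, and path here is hypothetical. The source does not state the actual threshold values, the exclusion criteria, or how many links were shown per page.

```python
def filter_links(candidates, threshold, excluded_sources=frozenset(),
                 excluded_targets=frozenset(), max_links=5):
    """Keep the highest-scoring links that pass every quality control.

    candidates is an iterable of (source, target, score) tuples.
    """
    kept = {}
    ranked = sorted(candidates, key=lambda c: c[2], reverse=True)
    for source, target, score in ranked:
        if score < threshold:
            continue  # below the confidence threshold
        if source == target:
            continue  # a page should not link to itself
        if source in excluded_sources or target in excluded_targets:
            continue  # blocked by an exclusion rule
        links = kept.setdefault(source, [])
        if len(links) < max_links:
            links.append(target)
    return kept


# Hypothetical candidate links with model scores.
candidates = [
    ("/a", "/b", 0.9),
    ("/a", "/c", 0.4),
    ("/x", "/b", 0.95),
    ("/a", "/y", 0.8),
]

result = filter_links(
    candidates,
    threshold=0.5,
    excluded_sources={"/x"},
    excluded_targets={"/y"},
)
print(result)
```

Manual review, the second control, sits outside a filter like this: content designers would inspect the output for the highest-traffic pages before it is published.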
Key Learnings
- Automated links can effectively supplement manual curation
- Data-driven approach improved user navigation
- Regular updates maintain link relevance
- Balance needed between automation and human oversight