< Project Overview >
CCTV networks are ubiquitous in our urban areas and can be repurposed as sensors that provide valuable insights, such as counts of pedestrians and vehicles. This processing can be automated with machine learning and designed to safeguard privacy by limiting the transfer of raw footage: more of the processing is done next to the camera, and everything except the count data is then discarded.
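As a rough illustration of keeping only count data, the sketch below reduces one frame's detections to per-class counts. The `Detection` type, the `count_objects` helper and the 0.5 confidence threshold are hypothetical names and values chosen for this example; they are not taken from the project's software.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One object reported by a (hypothetical) detector for a single frame."""
    label: str         # e.g. "person" or "car"
    confidence: float  # detector score between 0 and 1


def count_objects(detections, min_confidence=0.5):
    """Reduce one frame's detections to per-class counts.

    Only the counts are returned; no pixel data or object positions are kept.
    The threshold is an assumed value for illustration.
    """
    return dict(
        Counter(d.label for d in detections if d.confidence >= min_confidence)
    )


# Example detections for one frame (made-up values).
frame_detections = [
    Detection("person", 0.91),
    Detection("person", 0.42),  # below the threshold, so not counted
    Detection("car", 0.88),
    Detection("car", 0.77),
]

counts = count_objects(frame_detections)
print(counts)
```

In a deployment along these lines, only a record like `counts` would leave the device near the camera, and the frame itself would be dropped.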
REBUSCOV created a toolkit that simplifies and assists with the creation of privacy-conscious CCTV processing, through a collaboration with Almere and Street Systems. A comprehensive training dataset for the models was created with a reduced resolution, such that it prevents identification of almost all individuals. These training datasets are essential in the creation of accurate machine learning models but can be difficult to obtain, particularly for footage captured in the United Kingdom.
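The project's exact method for reducing resolution is not described here. As one possible approach, the sketch below block-averages a greyscale image so that fine detail is lost while larger shapes remain. The `downsample` function and the factor of 2 are assumptions made for illustration only.

```python
def downsample(image, factor):
    """Block-average a greyscale image given as a list of rows of integers.

    Each factor-by-factor block of pixels is replaced by its integer mean.
    Rows or columns that do not fill a whole block are dropped.
    """
    if factor < 1:
        raise ValueError("factor must be at least 1")
    height = len(image) // factor
    width = len(image[0]) // factor if image else 0
    reduced = []
    for block_y in range(height):
        row = []
        for block_x in range(width):
            block = [
                image[block_y * factor + dy][block_x * factor + dx]
                for dy in range(factor)
                for dx in range(factor)
            ]
            row.append(sum(block) // len(block))
        reduced.append(row)
    return reduced


# A tiny 4x4 example image (made-up pixel values).
image = [
    [0, 0, 100, 100],
    [0, 0, 100, 100],
    [50, 50, 200, 200],
    [50, 50, 200, 200],
]

small = downsample(image, 2)
print(small)
```

The larger the factor, the less detail survives, which is the trade-off between privacy and what a detection model can still learn from the image.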
This project focused on addressing barriers to entry and the economic barriers to operating, upgrading, and switching between technologies. This was to be achieved by:
1. Providing best-practice guidance for mesh deployments or fog computing deployments, to enable near-edge processing.
2. Creating an open training dataset and detection model that could be the foundation for further CCTV analytics.
3. Interacting with CCTV operators to understand, document and where possible provide solutions to barriers concerning data sharing, repurposing of existing assets, and deployments.
What was done?
A data protection impact assessment was prepared to understand the risks and appropriate mitigations before a large training dataset was produced. This leveraged highway camera images made available under Open Government Licence in Tyne and Wear, archived by the Urban Observatory over a period of more than twelve months, allowing a broad range of weather conditions and lighting to be considered across more than 250 cameras. The dataset contains 9,500 images. A multi-object detection model using machine learning was trained using this data.
A test deployment of a mesh network was established in the centre of Newcastle, using receivers on a tower block and a compute facility on the roof. This facility is capable of far more involved and detailed processing on multiple cameras than would be achievable within the constraints of installing equipment on lamp columns. The lessons learned were summarised in a report.
The project partner, Almere, also engaged with a number of other local authorities including a London Borough, rail operators, contractors responsible for managing relevant assets such as SSE, and operators of CCTV networks including Newcastle City Council and the Tyne and Wear Urban Traffic Management and Control Centre.
The value of repurposing existing CCTV assets and of new deployments on existing street furniture has been demonstrated through an operational network in Newcastle upon Tyne that proved invaluable throughout the COVID-19 pandemic.
This is enabled by a machine learning model, based on the large and comprehensive training dataset also created, which achieved high levels of accuracy in Tyne and Wear and good transferability in tests on footage from Glasgow. Because the training data is based on imagery from the UK, it outperforms other available training datasets, which were often recorded in China, where street scenes can appear quite different. The project partner has been able to lower the cost of similar deployments by applying lessons from the tests conducted in this project.
Deliverables and other tangible outputs
- Sector survey/analysis: Commentary on a number of interviews held with relevant stakeholders about barriers to CCTV-based analysis and statistics.
- Hardware demonstrator: Operational counting system in central Newcastle used by Newcastle City Council and Newcastle University’s Urban Observatory as part of its open data statistics offerings.
- Training material: Brief details of challenges and potential solutions resulting from the demonstrator mesh deployment.
- Training material and template legal contract/agreement: Detailed instructions for deploying an equivalent of HowBusyIsToon elsewhere, including a pro-forma data protection impact assessment for this type of CCTV-based analysis.
- Software product/source code: Open-source code and trained machine learning model that can be used to detect vehicles and pedestrians in real time on CCTV feeds, with results made available through APIs for integration into other systems.
- General public or sector stakeholder engagement activities: Data obtained and networks established through this work have provided valuable data to the Department for Transport, 10 Downing St, and the Joint Biosecurity Centre for understanding mobility during the COVID-19 pandemic.
- Processes or services created/enhanced: Additional funding was obtained. Further work enabled by the interviews conducted by Almere has resulted in pro-forma data protection impact assessments and guidance for local authorities deploying CCTV-based counting networks. These are now being operationalised by North Tyneside Council and the Essex and Herts Digital Innovation Zone for forthcoming deployments.
- IP created: The machine learning model and training dataset have been adopted as an asset by the Office for National Statistics and are in use as part of the Faster Economic Indicators pipeline.
- Jobs created/safeguarded: Expertise further developed at Newcastle University as part of this project has resulted in further funding wins and involvement in projects such as TRACK, the national COVID-19 risk model for public transport, which is a £1.7M project that employs two members of staff at the university.
- The operational network deployed in Newcastle is being further extended to include additional areas of the city, having demonstrated the value of and appetite for the data in evaluating the efficacy and consequences of Low Traffic Neighbourhoods and other urban interventions.
- A bid was recently submitted to extend the CCTV analysis to provide data specifically targeting parameterisation of spread models for SARS-CoV-2. This would fill an important gap in existing datasets on human behaviour, and is only possible through observations in public places that provide the type of privacy safeguards used in REBUSCOV.
The capabilities provided through this project were incredibly timely for the COVID-19 pandemic, ensuring we had a baseline of counts before the restrictions hit. The unique nature of the data collected in Newcastle has meant both Almere and Newcastle University are recognised nationally for their expertise in this space, resulting in our involvement in further projects.
CCTV-based analytics is becoming increasingly controversial because the term describes a wide range of activities, including, in some places, facial recognition by law enforcement and the private sector. Some objections were received, so a communications strategy that gave more thought to how this work is described publicly, including the safeguards it has in place and how it differs from other activities, could have been helpful.
The training dataset created is incredibly large and might benefit from some form of approval mechanism for those requesting access, because it could also be used for more controversial purposes like mass surveillance. A good data sharing platform to host the data and approve requests would have been useful.
What has Pitch-In done for you?
The data and software developed through this project have created a capability that now underpins critical parts of the UK’s national statistics and response to the pandemic.