How AI is rewriting the role of video

In this Q&A interview, we speak with Jordan Cullis from Milestone Systems about the rising use of AI in video management and how it is affecting security, smart buildings, cities and many other aspects of modern life.

 

The AI News Blog: We’ve seen huge advances in computer vision and analytics recently. How do you see AI transforming CCTV and video systems beyond their traditional security roles?

Jordan Cullis: We’re moving from cameras being passive observers to becoming active sources of intelligence. AI enables video systems to detect, interpret and even anticipate what’s happening in a particular space, whether the goal is improving safety, optimising operations, or enhancing customer experience. We have been talking about moving beyond security for a fair while, and AI is now driving the next level of that innovation. Video data is incredibly rich. It can tell you how people move through a store, how space is used, or where queues form, all of which is information that can improve efficiency and design, not just security. Combining AI with video and analytics is a potent blend of technologies, and one that is driving us into a smarter future.

 

The AI News Blog: Integrating AI into legacy camera systems has always been tricky. How are organisations overcoming that?

Jordan Cullis: The key is openness and flexibility. Modern video management platforms are increasingly modular, allowing analytics to run at the edge, in the cloud, or on-premises depending on the use case. We’re also seeing strong collaboration with AI partners that make it easier to deploy pre-trained models or build custom ones without deep coding expertise, which lowers the barrier to AI adoption. Our recently announced vision language model as a service and generative AI plug-in for our XProtect VMS are examples of this.

 

The AI News Blog: Many organisations still treat AI video analytics as a detection tool. What’s the next step in making it a true operational asset?

Jordan Cullis: Context is everything. When analytics are embedded directly into live video streams, they can generate actionable insights such as occupancy levels, dwell time, or flow mapping. Those insights can then inform staffing levels, energy use, or even retail layout. The shift is from triggering alarms to driving decisions. Video becomes part of an organisation’s operational intelligence, not just its security posture.
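
To make the dwell-time idea concrete, here is a minimal sketch in Python. It assumes an upstream analytics module already assigns a track ID to each person and reports a timestamp whenever that person is seen inside a zone; the function names and the data layout are hypothetical and do not represent any specific product API.

```python
def dwell_times(observations):
    """Return seconds between first and last sighting for each track.

    `observations` is an iterable of (track_id, timestamp_seconds) pairs,
    a hypothetical format standing in for real analytics output.
    """
    first_seen = {}
    last_seen = {}
    for track_id, ts in observations:
        first_seen[track_id] = min(ts, first_seen.get(track_id, ts))
        last_seen[track_id] = max(ts, last_seen.get(track_id, ts))
    return {t: last_seen[t] - first_seen[t] for t in first_seen}


def mean_dwell(observations):
    """Average dwell time across all tracks, or 0.0 if there are none."""
    times = dwell_times(observations)
    return sum(times.values()) / len(times) if times else 0.0


# Illustrative sample: two people seen in the same zone.
SAMPLE_OBSERVATIONS = [(1, 0.0), (1, 30.0), (2, 10.0), (2, 20.0)]
```

A figure such as the mean dwell time per zone is the kind of number that could feed a staffing or layout decision rather than an alarm.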

 

The AI News Blog: What are the main challenges when scaling AI across large camera networks?

Jordan Cullis: Scale introduces complexity, and there’s no secret about that: compute demand, bandwidth, latency, false positives, and the constant need to retrain models. Edge processing helps because it reduces latency and bandwidth use, but maintaining model accuracy over time is critical. Then there’s the practical side: different camera hardware, lighting, firmware versions and so forth, all of which affect how models perform. A flexible, open architecture helps overcome these challenges by letting organisations plug in the right tools for their specific environment.

 

The AI News Blog: Ethics and privacy are hot topics in AI. How should the industry be approaching these issues in video?

Jordan Cullis: With transparency and accountability front and centre. The industry must commit to responsible AI, which means real human oversight, explainable outcomes, and compliance with the relevant privacy regulations in every market. Organisations should also have the freedom to decide which analytics they actually enable, and then ensure they can audit those systems. That’s how we build public trust in AI video technologies.

 

The AI News Blog: We’ve heard about an initiative called Project Hafnia that aims to make AI model training more accessible. What’s that about?

Jordan Cullis: It’s a platform that gives developers and partners the ability to train vision language models (VLMs) responsibly and at scale. The idea is to provide access to high-quality, annotated video datasets and tools that make model fine-tuning more efficient, while maintaining traceability and compliance. By using platforms like NVIDIA Cosmos Curator, we can accelerate innovation in computer vision while keeping a strong ethical framework in place.

Just recently, we announced the upcoming launch of the Hafnia VLM as a Service (VLMaaS) and a generative AI plug-in for our XProtect VMS, both developed in collaboration with NVIDIA. The plug-in gives users the power of advanced, large-scale AI models, making tasks such as video review and alarm handling a lot easier. Instead of operators spending hours sifting through footage, the tool automatically summarises clips, validates events to cut down false alarms, and creates quick incident reports.

 

The AI News Blog: Metadata seems to be a big focus now — can you explain its role?

Jordan Cullis: Metadata is the backbone of scalable analytics. Rather than transmitting full video streams, AI can extract metadata (such as object counts, motion paths, or attributes), which can then feed dashboards or trigger workflows. It’s much lighter on networks and storage, but still provides rich insights for trend analysis and long-term optimisation.
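
As a rough illustration of that metadata-first approach, the Python sketch below treats each detection as a small record instead of a video frame, aggregates the records into object counts, and flags cameras that cross a threshold. The `Detection` fields, the camera names and the threshold are assumptions made for the example, not a real metadata schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One metadata record (hypothetical schema, not a vendor format)."""
    camera_id: str
    timestamp: float   # seconds since the start of the window
    object_class: str  # e.g. "person" or "vehicle"
    track_id: int      # stable ID for one tracked object


def count_by_class(detections):
    """Count distinct tracked objects per (camera_id, object_class)."""
    seen = defaultdict(set)
    for d in detections:
        seen[(d.camera_id, d.object_class)].add(d.track_id)
    return {key: len(ids) for key, ids in seen.items()}


def cameras_over_threshold(counts, object_class, threshold):
    """Return sorted camera IDs whose count for a class meets the threshold."""
    return sorted(
        cam for (cam, cls), n in counts.items()
        if cls == object_class and n >= threshold
    )


# Illustrative sample: a few records stand in for seconds of video.
SAMPLE_METADATA = [
    Detection("cam-entrance", 0.0, "person", 1),
    Detection("cam-entrance", 0.5, "person", 1),
    Detection("cam-entrance", 0.5, "person", 2),
    Detection("cam-entrance", 1.0, "person", 3),
    Detection("cam-lobby", 0.0, "person", 7),
    Detection("cam-lobby", 0.0, "vehicle", 8),
]
```

The counts could populate a dashboard, while the threshold check stands in for a workflow trigger such as opening another service point when a queue builds up.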

 

The AI News Blog: Finally, where do you see AI-driven video heading in the next five years?

Jordan Cullis: We’re heading toward predictive, integrated systems. AI-driven video technology will not only alert operators to an incident but will also trigger automated responses such as adjusting lighting, dispatching maintenance, or rerouting foot traffic. It will also connect seamlessly with broader enterprise systems, ultimately transforming video into a sensor network that helps cities, buildings and businesses operate more safely and efficiently.