Artificial Intelligence "At the Edge"
The Artificial Intelligence revolution is well underway. We actively use AI via services like ChatGPT, DeepL's natural language translation and the “assistants” in the latest phones, smart speakers and photo editing services. We are also passive users whenever we interact with organisations by phone or online, where AI is used at least to control and filter the first stage of the interaction.
Beyond these relatively simple labour-saving tasks, serious studies are already underway into how AI can assist with important topics such as analysing medical scans to improve the detection of anomalies, or maintaining security by analysing CCTV feeds.
Most of these services rely on a model in which an end device with diverse input methods (voice, keyboard, camera, file transfer) sends the request to a server farm somewhere in the cloud, where the AI processing is done. This has the clear advantage that almost unlimited computing power can be thrown at the problem and, where relevant, wide-scale use can be made of search to produce answers and output.
The results obtainable can be impressive – well written essays, photorealistic images, natural translations and much more. However, this approach does come with several drawbacks.
Issues with AI solutions
Firstly, the server farms used to run all-purpose AI engines are incredibly expensive to run. They use large numbers of racks of dedicated processors, which can cost $10,000 or more each for the latest generation of devices. Such data centres are massively power hungry, raising questions about the environmental impact of ever-increasing AI use. Various credible public-domain estimates of the cost of running the ChatGPT service suggest it could be not far off $1 million per day, although these are hard to verify.
This pushes users to rely on the services of third-party providers, adding to the complexity of an AI infrastructure and creating a dependency not entirely within the user's control. The likes of Google and Amazon can have dedicated infrastructure to support their systems, but that is clearly not the case for everyone.

Networks and Latency
Such an architecture requires reliable and fast connections between whatever the input device is, and the server infrastructure in the cloud.
It also introduces latency that can be unpredictable, depending on the load other users place on the service and the type of connection available to the input device.
This may not be an issue in the case where a user at a desktop is accessing AI services via a fast corporate internet connection. However, for an IoT type solution, where a number of sensor devices are gathering data that might be usefully analysed by AI, such an architecture risks being complicated and unwieldy. The benefits of using AI may be outweighed by the drawbacks.
AI “at the Edge”
Fortunately, there are many solutions where AI can be of value which do not need the power of a full-scale general-purpose AI engine. In such cases, where there is a controlled problem domain, a relatively small device may be able to run a limited AI inference engine.
In such cases it can make sense to carry out the AI processing on the end device itself, which is what is known as “AI at the edge”. Even though the AI inference engine is running on a small device, it can perform its task faster, as there is none of the latency inherent in communicating input data to a remote server and waiting for a response.
Reducing Power Consumption
Such a solution can also reduce power consumption, as there is less need for communication. Particularly for wirelessly connected devices, where the radio is often the most power-hungry part of the system, minimising radio traffic can be key to enabling battery-powered devices with a reasonable level of autonomy.
In such a scenario, it is important to distinguish between AI model training, which inevitably has to be carried out on some kind of larger computing system, and the inference engine, which is the output of the training process. The latter can be a relatively small component if it is focussed on a limited problem domain.
To give a concrete (and real world) example, one of our customers wanted to implement a people counting device using an IR sensor, in order to manage HVAC for comfort and efficiency. A traditional solution might involve writing some custom code to analyse sensor data and tune it manually to give accurate results. With an AI solution, a model can be created and trained using sensor input, removing the need to create custom code.
Another example might be a video doorbell, which should only switch on when someone approaches. An AI model could potentially make a better distinction between a person approaching the camera and random background movement picked up by the sensor. This in turn reduces power consumption and can make battery operation realistic.
An AI solution doesn’t have to be a binary choice between “AI at the edge” and “AI in the cloud”. Some initial processing of data at the front end can be used to identify clear or urgent cases, with other data sent onwards for cloud processing. For example, in an industrial control setting, a front-end AI might be configured to identify urgent imminent critical failures that might require shutting a machine down, whilst some other data might be sent up to the cloud for longer term analysis to aid process improvement.
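The hybrid split described above can be sketched as a simple dispatch routine: the local inference engine scores each event, acts immediately on critical cases, and forwards the rest for cloud-side analysis. The threshold value and the two hook functions are hypothetical placeholders, assumed for illustration.

```c
#include <stdbool.h>
#include <stdio.h>

/* Illustrative threshold; a real value would come from the trained model. */
#define CRITICAL_THRESHOLD 0.90f

/* Hypothetical hooks into the rest of the system. */
static void shut_down_machine(void)      { puts("machine halted"); }
static void queue_for_cloud(float score) { printf("queued score %.2f\n", score); }

/* Dispatch one anomaly score from the local inference engine.
 * Returns true if the event was handled locally. */
static bool dispatch(float anomaly_score)
{
    if (anomaly_score >= CRITICAL_THRESHOLD) {
        shut_down_machine();            /* act locally, no round-trip latency */
        return true;
    }
    queue_for_cloud(anomaly_score);     /* long-term analysis in the cloud */
    return false;
}
```

The design choice here is that only the rare, urgent decision is taken at the edge; everything else is batched up for the cloud, keeping both latency and radio traffic low.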

Privacy Issues
Privacy is a further concern when considering where intelligence in a system should be placed. Medical wearables can provide valuable and even life-saving diagnostic information. But not everyone would be happy to have medical data shared with cloud services, however secure they claim to be. Local processing using models trained offline can provide the best of both worlds.
It is notable that Apple takes a different approach from Google and others to this issue. Apple AI tends to run locally on the device to maintain user privacy, whereas Google (and others) have little issue with taking user input and processing it in the cloud. This arguably gives Google an advantage in developing its AI engines, but at the cost of user privacy.
Device Characteristics
In practical terms, what kind of devices are required to carry out “AI at the edge”? Early, simple IoT devices released a decade ago were typically built around a Bluetooth chip with an integrated microprocessor, perhaps an Arm Cortex-M0 running at 32 MHz with 256 KB of flash memory and 32 KB of RAM. Such a device could run a protocol stack and a simple application (to read some sensors, for example), but little more.
Next-generation devices have an order of magnitude more capability, with Cortex-M33 processors running at 300 MHz or more, large flash and RAM capacity, and often dual-core architectures so that the protocol stacks can run on a core completely independent of the main application core. Such devices are certainly capable of running simple AI inference engines, while maintaining the low-power capability of traditional IoT solutions, running on Arm cores with low-power characteristics designed in from the start.
Specialist AI microprocessors and devices
If a more complex application is required, a variety of dedicated AI-oriented microprocessors exist. These range from scaled-down versions of Nvidia server chips to AI-oriented DSP devices. There remains a trade-off between performance and power consumption, but these devices can carry out AI functions while remaining frugal in their power consumption. Often the trade-off is rather that a particular device is designed to process a certain type of input – images, voice commands or sensor data – rather than carry out generic AI functions. For an IoT device, however, this is often perfectly acceptable.
The first such device from Insight SIP is the ISP2554-HM, based on the Nordic Semiconductor nRF54H20. This multicore device offers a radio processor to run the BLE stack and associated applications, a powerful AI-capable Cortex-M33 application processor running at up to 260 MHz, and independent security and peripheral processors. The core application processor can thus be dedicated to AI tasks without having to handle real-time interrupts from the radio or connected devices.
Future Evolution of AI and IoT
AI is still a fast-developing technology. AI models are becoming more complex and sophisticated, and the underlying hardware is also developing fast. It has also become a domain of geopolitical competition, with the Chinese company DeepSeek making headlines through its claim to have created a much more efficient AI model requiring significantly fewer resources in terms of computer hardware.
At the same time, the business models behind AI are yet to be clarified. The technology is currently in the “race” stage, where major technology firms are less interested in profit than in establishing leadership. Offering services like ChatGPT for free (at least at some level) aids model development by engaging a massive user base. Clearly that isn’t sustainable long term. Carrying out AI efficiently and cost-effectively will inevitably become a bigger focus, and the AI-at-the-edge model will be a major part of that.
Nick Wood, Sales and Marketing Director, Insight SIP
First published in Components in Electronics June 2025



