This technology briefing covers the full lifecycle of dataset generation, AI model creation, deployment, and inferencing on the edge. Emrah Gultekin, CEO of Chooch AI, presents case studies and a walkthrough of the AI platform.
Thank you for joining us today. Today we’re going to be talking a little bit about AI models on the edge and why that’s important. I’ll also be taking you through a bit of the AI life cycle, trying to demystify some of it, because there’s a lot of chatter out there about AI, so we’re just trying to demystify it and make it more explainable for everybody.
Let’s begin with who we are. We are a company based out of the Bay Area, and we are copying human visual intelligence into machines. That’s the fundamental goal and target we have as a company. And we’re part of a larger ecosystem of AI players, OEM players, distributors of services and so forth. We’re all in this together trying to copy human intelligence into these machines, and we’re the vision part of that. What that really means is fast AI training and deployment, both in the cloud and on the edge. And so we’ll go through some of that life cycle as well as we move forward.
What do we do as Chooch AI? We detect things. I mean, that’s really what this is about. It’s about visual detection, and we’re talking about detecting and tracking objects, images, faces, actions, conditions, and so forth. And it’s an open platform that anybody can start using. It’s currently used for infrastructure monitoring and surveillance. We’re using it in healthcare and industrial safety. We do a lot of geospatial analysis as well, and retail and media. It’s kind of ubiquitous in that sense, because it is a platform. And so we enable people in those verticals to deploy AI very, very quickly. Friction in getting these deployments out has been an issue in our ecosystem, so reducing that is one of our goals as a company as well.
What are we going to be talking about today? Why edge AI, why that’s really crucial, especially for visual information. The AI model life cycle, how these are developed and how they are deployed. We’ll talk about our platform, what we do as a company and we’ll also talk about how we get customers up and running very quickly with our platform. And obviously, next steps for anybody either using pre-trained or custom models and we’ll go through that as well a bit on the system.
Why is edge so important, specifically for visual tasks? And there are a number of reasons for these, but these are the top five. First of all, the cost of video streaming to the cloud is very, very expensive. It’s very expensive. It doesn’t justify the benefit today. That’s the number one reason. Number two is network load. It’s very heavy on the network, streaming of these videos constantly. Imagine doing 24/7 surveillance with a thousand cameras out there and you’re constantly streaming that to the cloud. The network load is very, very heavy.
The latency is not great either. There’s a lot of latency on the cloud because the information has to go to the cloud, it has to be inferenced there, and it has to send back a prediction. And imagine doing that for every frame or every few frames, or even every 15 or 20 frames. The latency is really there. And also maintaining privacy is a major issue, especially if you’re handling PII. Most customers want this information to be stored and inferenced on prem, so maintaining that privacy is very, very important, especially in surveillance and healthcare.
The last bit of that is a lot of places don’t have bandwidth and these things need to work in disconnected environments, or even if they do have bandwidth, they may be disconnected for a split second. And if you’re doing something critical, you can’t have that because it’s very, very dangerous to have that in a critical situation. The edge works on disconnected environments, too.
Let’s move forward here. A little bit about the AI life cycle. This is crucial to understand, to demystify some of the issues that we see in the AI ecosystem in terms of how it’s presented. We’re talking about it constantly. It’s on the news constantly. It’s in our feeds. How do we make it more explainable? It’s important to understand how this whole cycle works. And this is the case whether you’re using pre-trained models or custom models that you build, inference engines, whatever you’re really doing.
It’s really in three different buckets. The first bucket is data set generation. This is where you collect data, you’re annotating data, you’re augmenting data, you may be generating synthetic data. What you’re doing here is creating data to potentially teach the AI what that data means. And so that data set generation is a universe unto itself. There are many companies doing that, and it’s very important to do it properly. Whether you’re using pre-trained models or custom models, it really doesn’t matter; that data set generation is there. The customer doesn’t usually see this, especially if they’re just looking at the inference, but that data set generation piece is crucial for the model to be developed.
The next phase of that is model training. Now that you have a data set with annotated images, videos, and so forth, you want to be able to train a model on it, to generate a model out of it. You’re teaching the AI what you want it to see. That model training happens after data set generation, and you can use any of the deep learning frameworks out there. There are different neural nets, like the ResNets and so forth, that you can use to train those models as well.
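To make that training step a little more concrete, here is a minimal, hypothetical sketch of fine-tuning a pretrained ResNet on an annotated image data set with PyTorch. The data set path, class layout, and training schedule are placeholders; this is a generic illustration of the technique, not Chooch’s internal pipeline.

import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

# Annotated data set laid out as one folder per class (e.g. "fire", "no_fire").
train_data = datasets.ImageFolder(
    "annotated_dataset/train",              # placeholder path
    transform=transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
    ]),
)
loader = DataLoader(train_data, batch_size=32, shuffle=True)

# Start from a pretrained ResNet and replace the classification head.
model = models.resnet50(weights="IMAGENET1K_V2")
model.fc = nn.Linear(model.fc.in_features, len(train_data.classes))

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(5):                      # short illustrative run
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()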
The trick here is that with the data set you develop, the only way to really understand whether it’s a good data set is to train the model, so there’s something counterintuitive going on there. Once you train the model, you test it and then you deploy it into an inference engine. And that’s what we call inferencing over here. That’s when the predictions are done. The client really sees the inferencing. When you see bounding boxes and all that, that’s inferencing. You have new data coming in and then the AI predicts what that new data is based on the data it was trained with. And so that’s called inferencing, and that’s the inference engine.
And when we talk about edge AI, we’re really talking about putting the inference engine onto these devices. And part of the data that comes out of the inference engine goes back into data set generation; there’s a loop going on. That data set generation then feeds back into model training, and then inferencing, and so forth. It’s really important to understand this. It’s not hocus pocus. It’s all technology and it’s all explainable to a certain degree.
The Chooch AI platform is an end-to-end AI platform that does all of this, from data set collection to training to inferencing, and the reason it has to be that way is because of the friction of deploying these to clients, whether you’re doing it on the cloud or on the edge. It’s an end-to-end platform in that sense. You have the Chooch dashboard here, where you’re collecting data, you train models, you deploy and manage them remotely, and you do API integration and so forth. This is the main place where you have your account. You can access your data sets, build data sets and so forth, build models, whatever you need to do.
And then you have the edge AI inference engine. There’s an inference engine on the main cloud dashboard, which is mirrored onto these devices, and this is where the predictions and models are managed as well. Then you have imaging devices or video streams coming into the inference engine. There’s a decoder on the inference engine, and the imaging devices send that information through an RTSP feed, or it could be an API, into the inference engine, and then you get metadata being generated. That metadata could be alerts or actions for objects; whatever class titles come out can be tied to external systems if necessary, for clients or integrators that use other types of dashboards or other types of machine-to-machine tasking.
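Conceptually, an edge inference loop of this kind looks something like the sketch below: decode frames from an RTSP stream, run a detector on them, and emit metadata that downstream systems can consume. The model, stream URL, threshold, and field names here are placeholders for illustration, not Chooch’s actual engine.

import json
import cv2
import torch
from torchvision.models import detection

# Generic pretrained detector standing in for a deployed model.
model = detection.fasterrcnn_resnet50_fpn(weights="DEFAULT").eval()

cap = cv2.VideoCapture("rtsp://camera.local/stream")  # placeholder RTSP feed

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # BGR (OpenCV) -> RGB float tensor in [0, 1]
    tensor = torch.from_numpy(frame[:, :, ::-1].copy()).permute(2, 0, 1).float() / 255
    with torch.no_grad():
        pred = model([tensor])[0]
    # Emit one metadata record per confident detection.
    for box, score, label in zip(pred["boxes"], pred["scores"], pred["labels"]):
        if score > 0.8:
            print(json.dumps({
                "class_id": int(label),
                "score": round(float(score), 3),
                "coordinates": [round(float(v), 1) for v in box],
            }))

cap.release()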
This is a high-level overview of the Chooch platform, and we’ll go into some more detail here. This is an example of an AI model for fire detection. It’s collected from videos and images, which are public or private depending on what type of use case you have. It’s annotated on the system, the model’s trained on the system, and then it’s inferenced on the system through either the cloud or the devices. And the purpose of this is to send alerts, obviously, when it sees a fire. You have normal sensors that sense fires, but this is an alternative way of doing that, augmenting that capability through fire and smoke detection, so you have visual cues, not just dumb sensors. These feeds go into the inference engine, and then the devices constantly inference what’s going on, every two hundredths of a second, and send alerts to the people who need to be in the know.
We’ll walk you through how this works on the system, how to get it up and running immediately. Basically you log into your Chooch account, you select the device, you select the camera, you select the model, and you install it to get the AI running for the initial setup. So we’ll do that right now. You go to chooch.ai. You set up an account and log into it, and basically you have a dashboard here. You’ve got files, you’ve got public models, and you’ve got your own models if you’ve trained object detection, image, facial models and so forth. Here, we’re talking about devices on the edge, so go directly to devices. And here you see any devices that you may have set up. This will be empty for you, but there’s the setup guide here too.
The first thing you want to do is create a device, right? That’s the first thing that you do, create a device. We call this the office device number one. You choose the type of device you’re using. And we have GPU devices here, PCs or Jetsons and soon we’ll also have the capability to put this on Intel CPUs as well and we’ll add that to the dashboard. And basically you can enter your API endpoint. You can put the location … it’s optional … a description. You create a device. That’s the first thing that you do. You’ve created office device one.
Now you want to add a stream to it, right? You need to add streams to it. Add a stream. This is the entrance camera. And you put the RTSP feed into this as well. And then you add that camera, and boom, it’s here with the camera RTSP. Now, what you need to do is add a model or models to it. You go to add models. You may add more cameras to that device, so you first select the camera. It’s a camera-based model, so you need to select the cameras that you want that model to be running on.
You select your model, and you may be using public models, or you may have trained your own model as well, which you can do on the system. These are some of the public models we have, fire, handgun, rifle detection, aerial drone. I’m going to choose fall detection, human fall detection, and boom, it’s right there. It appears. And you can add multiple models obviously through this.
What do you do next? You go to your device. Everything that we did previously was all on the cloud, on your account, through Chooch AI. The next thing you do, now that you have a device, maybe a Jetson or a PC, is install the inference engine. It’s Linux-based, so you need to pull the Docker image. It’s on Docker Hub, and you can just pull it and start installing it. And the next thing you do is put in the device ID, because now it’s going to connect your cloud account to your device account. You copy your device ID, you put it in here, and boom, it’ll prompt this on your device. You connect it. And you’ll have this panel that appears, which is connected to your cloud account. Now, this is your edge AI dashboard. It’s separate from your cloud dashboard. Now this is on the device.
And now you want to see the predictions. I’ve got a camera set up, I’ve got this device set up. You can go to predictions here and see in real time what’s going on for that particular camera. You see the class title, you see the score, which is the confidence of the prediction. You’ll have the model and, if you have multiple cameras, which camera it is. You’ll have the coordinates of what it has detected. You’ll see the path, which way it’s moving. You’ll have the date, and you can also view the JSON response here and connect that JSON response to anything else that you may be doing. Very basic setup, very basic JSON response that you can integrate into pretty much anything. We’ll take questions at the end. I see some questions coming through.
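As an illustration of integrating that JSON response, here is a small, hypothetical sketch that takes a prediction record like the one just described and forwards high-confidence detections to an external endpoint. The field names and webhook URL are assumptions for the example, not the documented schema.

import requests

# Example prediction record, with fields mirroring the panel described above.
prediction = {
    "class_title": "person fallen",
    "score": 0.93,
    "model": "Human Fall Detection",
    "camera": "entrance camera",
    "coordinates": [412, 218, 660, 540],
    "date": "2021-06-18T14:02:07Z",
}

# Forward confident detections to whatever system you already run
# (ticketing, alerting, a building-management dashboard, and so on).
if prediction["score"] > 0.9:
    requests.post(
        "https://example.com/alerts",   # placeholder integration endpoint
        json=prediction,
        timeout=5,
    )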
Basically what we’re really talking about here is getting these AI models from the cloud environment onto devices so that you can do constant predictions, constant inferencing, without a load on anything, and keep your privacy at the same time. It’s available today on GPU hardware with existing or new video systems, as discussed. But in the next couple of weeks, we’ll be making an announcement on other CPU devices as well, and we’ll be running on Windows as well, which is a major breakthrough for our ecosystem.
The benefits are obviously increased productivity, reduced risks and costs. And if you want to start a computer vision project, begin with a defined automation goal that you have. There are lots of things that you could do with these types of systems, but you really need to know what you want to start out with. And it’s best to start out with the pre-trained models, because that gives you some feel for how the system works and how you can integrate it into your own devices or your own platforms and so forth. Yeah. Thank you for that. And I think it’s time we can start taking some questions. Let’s see here.
Jeffrey Goldsmith:
Yeah. Thanks, Emrah. Yeah, I think the first question is about inferencing. Does inferencing include model validation against pre-labeled frames? Do you see the questions, Emrah?
Emrah Gultekin:
I don’t see the questions for some reason.
Jeffrey Goldsmith:
Okay. The first question is, does inferencing include model validation against pre-labeled frames?
Emrah Gultekin:
Yeah, that’s a really good question. The system itself on the cloud has … What happens is when you’re training a model, right, you have the training set and then you have the validation set. And the system defines that itself. It takes part of the data set that you’ve created and puts it in a training set and the rest goes into the validation set. The system, the machine generates some type of accuracy for that particular model, but that’s never enough, right? You need to be able to test it, especially if you’re doing something very new. You need to be able to test it manually before you deploy it.
Emrah Gultekin:
There is a manual testing tool where you can upload images, and then you’re able to mark whether each result is correct or not, or if there are any missing results, to get you precision and recall, which will give you your F1 score. You don’t want to just deploy something out there without checking what its accuracy or F1 score is. You want to get to a certain F1 score defined by you, and the system generates that report as well. We didn’t go into how the cloud dashboard works, but before you deploy, even on a device, you need to be doing that on the cloud dashboard. But great question. Yes.
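For reference, precision, recall, and F1 relate as in this small sketch; the counts below are made-up numbers just to show the calculation.

def precision_recall_f1(true_pos: int, false_pos: int, false_neg: int):
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# e.g. 90 correct detections, 10 false alarms, 20 misses
print(precision_recall_f1(90, 10, 20))  # -> (0.90, 0.818..., 0.857...)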
Jeffrey Goldsmith:
Yeah. Great. Thanks. Thank you, Emrah. We have about seven more questions. Let’s just go through them. Do you provide an environment for synthetic training data generation?
Emrah Gultekin:
What we do in the background is data augmentation. The client doesn’t see that; the user doesn’t see that. Every image that you put in is multiplied by 18. And we do that because the data might be dirty, you might have different angles, you might have different lighting and so forth, or different sizes of things, a distance shot or a close-up and so forth. You need to do the data augmentation, and that is done automatically on the system. We do not do synthetic data ourselves, but we do data augmentation. If you do upload data into the system, you can play around with it quite a bit, but we do not do synthetic data at this point, no.
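As a rough illustration of the kind of augmentation being described (angles, lighting, scale), here is a generic torchvision sketch. The specific 18-variant pipeline is internal to the platform, so these transforms and parameters are only an assumption of what such a pipeline might contain.

from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomRotation(degrees=15),                  # different angles
    transforms.ColorJitter(brightness=0.4, contrast=0.4),   # different lighting
    transforms.RandomResizedCrop(224, scale=(0.5, 1.0)),    # distance vs. close-up
    transforms.RandomHorizontalFlip(),
])

# Applying the pipeline N times to one PIL image yields N augmented variants:
# variants = [augment(image) for _ in range(18)]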
Jeffrey Goldsmith:
Here’s a related question. Do you support temporal deep learning vision structures, like RNNs, BERT, et cetera?
Emrah Gultekin:
Yeah, so we have deployed temporal vision structures for human movement, human pose estimation and so forth. We do support that as a model. In terms of stitching that together with RNNs, we do have some temporal models that we are working on to make part of the larger ecosystem that we have. But in terms of detecting and stitching together different temporal models, you have to do that manually today on the system.
Jeffrey Goldsmith:
Yeah, but we’re continually adding to it so that as customers demand that, we will continue to add.
Emrah Gultekin:
The point here is to be able to … All of these things can be done manually on the system, but to be able to do it without friction means that the users, especially enterprise users, don’t want to see all that detail on the system, but it can be done manually, yes.
Jeffrey Goldsmith:
Yeah. Here’s another question. Do you support any cameras that have built-in inference acceleration instead of separate edge compute boxes? It’s a question about cameras with built-in acceleration.
Emrah Gultekin:
Yeah. If the camera does have a GPU, and we’ve been talking to some of the providers on that, built-in cameras with GPUs, or even CPU inferencing, we’re able to deploy onto those cameras as well, but we’d have to see the specs of the camera. Yeah.
Jeffrey Goldsmith:
Here’s a much simpler, less tactical question, Emrah. Do you also provide training support on your platform?
Emrah Gultekin:
Yeah. Well, we do the annotation work … Mainly we do the annotation work for our clients and the training of the models. We do provide that for enterprise clients. Usually, unless they already have an annotation partner or training partner, they want to do it all on one system. We do provide that support, yes.
Jeffrey Goldsmith:
Yeah. And here’s an ROI question. How do you look at the ROI justification of moving to the edge for an application that’s currently running in the cloud? How do we propose that?
Emrah Gultekin:
Well, the first thing you do is look at the cost of the cloud. At the extreme end of the spectrum, if you were inferencing 24/7 in the cloud, you’d be paying between $20,000 and $30,000 a year for each camera. But that’s an extreme case where you need 24/7 surveillance of that particular stream. In this case, you’d be paying $500 a year in licensing. There’s a huge ROI there. It’s not really about ROI; it’s whether they should do it or not. There aren’t that many companies actually streaming and inferencing in the cloud today.
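A back-of-the-envelope version of that comparison, using the figures just quoted (roughly $20,000 to $30,000 per camera per year for 24/7 cloud inferencing versus about $500 per camera per year for an edge license), with a hypothetical fleet size:

cameras = 100                # hypothetical deployment size
cloud_per_camera = 25_000    # midpoint of the quoted cloud range, USD/year
edge_per_camera = 500        # quoted edge licensing, USD/year

savings = (cloud_per_camera - edge_per_camera) * cameras
print(savings)               # 2,450,000 USD/year for this hypothetical fleet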
Jeffrey Goldsmith:
It’s true. Okay. Here’s another one. Once models have been deployed to edge devices, do you have a way to measure model degradation and update them? I think that’s really a question about measuring the accuracy and then updating.
Emrah Gultekin:
Yeah. Can you repeat the beginning of that question? It had to do with the edge, though, right?
Jeffrey Goldsmith:
Sorry. Once models have been deployed to the edge devices, do you have a way to measure model degradation and update the models?
Emrah Gultekin:
Yeah. It’s a good question, because this has been a problem with edge deployments today; there’s too much degradation on the edge. But in our system, whatever you see on the cloud is the same on the edge. There is no difference in terms of accuracy and in terms of performance. We’ve combined all of that into one platform. If there is degradation, you’ll see it on the cloud as well, and then you have to update the model on the cloud, which is then reflected on the edge. It’s a good question, but you’ll see all the metrics of that on the cloud itself.
Jeffrey Goldsmith:
And we do update our models on edge devices remotely. That’s all part of the platform, right?
Emrah Gultekin:
Well, if you update your model on the cloud, there’s a system update that pushes it to the device automatically. So yes, that’s correct.
Jeffrey Goldsmith:
Yep. A couple more questions, one that was sent in chat, and you’re all welcome to keep asking questions. This is great. How does inferencing play into data set generation? And there’s a related question from the same attendee. Does the platform help with data annotation? The first one is how does inferencing play into data set generation and does the platform help with data annotation?
Emrah Gultekin:
Yeah. It’s a great question. The full cycle of this, from data set collection and initial model training to deployment, and then getting that information back into data set collection, is really crucial. Part of the inferencing that you do on the videos, and you’re able to define that, can go back to the raw data set. That goes back to the raw data set, and the annotation is done on the platform. You can use smart annotation, where there are object keys; we have annotators doing that all day long. There is a smart annotation tool where you can select object keys and it does the smart annotation for things that it already knows, right? But if it’s something that it doesn’t know, then you have to go back and relabel some of that. But yes, we do have that full cycle that goes back into the raw data and then back into the data set. It is annotated partly by machine, partly by humans if necessary, and then the data set is updated. Once a data set is updated, the model gets retrained. There is that cycle there. It’s a great question.
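A hypothetical sketch of that feedback loop: uncertain detections are routed back into the raw data set for smart or human annotation, and the model is retrained once the data set has been updated. The threshold, field names, and function here are placeholders, not the platform’s API.

LOW_CONFIDENCE = 0.6   # assumed threshold for routing frames back for review

def route_for_annotation(prediction: dict, raw_dataset: list) -> None:
    """Send uncertain detections back to the raw data set for (re)annotation."""
    if prediction["score"] < LOW_CONFIDENCE:
        raw_dataset.append({
            "frame_id": prediction["frame_id"],
            "suggested_label": prediction["class_title"],  # smart-annotation hint
            "needs_review": True,
        })

# Once enough reviewed items accumulate, the data set is updated and the model
# retrained, closing the data set -> training -> inferencing loop.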
Jeffrey Goldsmith:
Yeah. Yeah. I’m rewording another question that is a little complex here. What is the minimum asset requirement that the system requires to build a custom model so that it can detect yes or no whether an object is visible in a frame? How many images do we need at the minimum to detect whether a single object is clearly visible or [crosstalk 00:28:32]?
Emrah Gultekin:
Yeah, it really depends on the use case and whether the AI already knows that or not. And sometimes it’s as little as 30 images. Sometimes you need much more, maybe 500 to 1000, but it really depends on the use case.
Jeffrey Goldsmith:
Yeah. And once we see the use case, then we can define how much data is required. Let’s see here. I think we could go back and dive into this a little more deeply. What pre-trained models do we currently support on our platform? Maybe we should talk about the models that we have live now, Emrah. I think that that would-
Emrah Gultekin:
Yeah. I mean, we have 250,000 classes. We have over 115 models which are available as public models. We have over 4,000 models which are used by enterprises, so they’re private; they’re not open to the public. You can create your own models. You can share them. You can make them public. You can create your own data sets and so forth. There’s a lot out there, but the ones that are used the most, I would say, are currently general object detection. You’ve got face and demographics, the face-related ones. You’ve got fall detection, human action recognition. You’ve got fire detection. You’ve got the handgun, rifle, weapon detection. Those are the ones which are being used a lot. Sentiment is being used a lot as well. There are plenty out there, but if you start deploying these, it’s important to start with a few and then move on to more complex use cases.
Jeffrey Goldsmith:
Yeah. And anyone who’s interested in the answer to that question can go to our site and sign up for free and dive into the model area of the website and have a look at all the different models that are available. And we’ll be constantly adding to this library based on the use we see. Anyway, Emrah, I think there’s one more question here. What are the coding requirements for model creation and integration? I think this could be our last question at this point. Do you want to answer one more?
Emrah Gultekin:
Yeah. What are the coding requirements? Well, I mean, there are no coding requirements, but you do need to know Python if you’re going to integrate into a system that you already have. There is an API key so that you’re able to predefine the API that you want the predictions to go to, right? That’s the only bit of integration that you would need and maybe you’d need some Python skills for that, but it’s not a big deal.
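For instance, if you point the predictions at an endpoint of your own, a minimal receiver might look like the sketch below. The header name, payload fields, and route are assumptions for illustration; they are not the documented Chooch integration.

from flask import Flask, abort, request

app = Flask(__name__)
EXPECTED_KEY = "your-api-key"   # placeholder key shared with the platform

@app.route("/predictions", methods=["POST"])
def receive_prediction():
    # Reject calls that don't carry the expected key.
    if request.headers.get("X-Api-Key") != EXPECTED_KEY:
        abort(401)
    payload = request.get_json(force=True)
    # Hand the detection off to your own logic (alerting, storage, ticketing...).
    print(payload.get("class_title"), payload.get("score"))
    return {"status": "ok"}

if __name__ == "__main__":
    app.run(port=8080)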
Jeffrey Goldsmith:
Cool. Well, thank you very much, Emrah, for the presentation, and we’ll be sending around this recording and please get in touch if anybody has any further questions for us.
Emrah Gultekin:
Yeah. Thank you everybody. Thanks for attending. Pleasure. Happy Friday. Have a great weekend, everybody. Thank you.