
Implementing Computer Vision Applications in the Real World

In this recorded webinar with VANTIQ and Chooch AI, you will learn about how computer vision, a subset of artificial intelligence, will impact digital business transformation and smart applications that can be deployed on the edge. From fall detection to advanced fire alerts, computer vision is revolutionizing our ability to digitize camera streams and use that data to create “smart environments” in the physical world. Presented by Emrah Gultekin, Co-Founder and CEO of Chooch AI; Brett Rudenstein, Vice President of Sales Engineering and Services at VANTIQ; Patrick Burma, Senior Solutions Engineer at VANTIQ; and Tifani Templin, Partner Manager at Chooch AI.

Tifani Templin:
Hello, everyone. Thanks for joining today. I’m Tifani Templin. Our topic today is implementing computer vision applications in the real world with Chooch AI and Vantiq. Chooch AI is the leading platform for deploying computer vision to the edge; Vantiq is an agile, full-lifecycle development platform for rapidly building real-time applications. With us today is Emrah Gultekin, CEO of Chooch AI. Emrah, give us a little wave.
Emrah Gultekin:
Hey everybody. Good morning.
Tifani Templin:
We also have Brett Rudenstein, vice president of sales, engineering, and services at Vantiq.
Brett Rudenstein:
Hi everyone.
Tifani Templin:
And we have Patrick Burma, senior solutions engineer at Vantiq.
Patrick Burma:
Hello.
Tifani Templin:
These gentlemen will be introducing you to both companies, and they’ll give you a small demo. After the presentation, we’ll be doing a live Q&A, so please post your questions in the chat. If for some reason we run out of time, we will make sure to post responses when we send the recording tomorrow. Emrah.
Emrah Gultekin:
Thank you, Tifani. Good morning, everybody. Today, as Tifani mentioned, we’re going to be talking a little bit about implementing computer vision in the real world, how these things really work and how we can make places smarter, so creating smarter spaces and a smarter utilization of those places for efficiency purposes. So I’m going to give you a little bit of background about our company first. We’re a Silicon Valley-based visual AI company. Our mission has been from the start to copy human visual intelligence into machines, and this is really about creating smarter spaces. And why are we doing this? Why are we trying to create smarter spaces? It’s because we want to create more efficient and safer and more equitable spaces at the same time. So this is the grand vision that we’re pursuing together within an ecosystem with partners like Vantiq. We’re a horizontal, ready-now platform, which means you can start using it immediately once you sign up, and getting visual intelligence into machines really starts with the first step: detecting things, right?
Emrah Gultekin:
So for a machine to be able to detect objects, faces, actions and states, that’s really ground zero for this type of inferencing. And what we do is we ingest multiple image and video formats into our system, and we provide those inferences very, very quickly to users and to enterprises. So we do both cloud inferencing and also edge inferencing at the same time. And we’re going to get into why edge is so important, especially for video analytics. So our focus on edge AI computing is really crucial because what we’re doing here is we’re inferencing from models on premise with close proximity to the camera, and that’s really important for video analytics because streaming all that stuff into the cloud is just a no-go, especially if you’re at scale.
Emrah Gultekin:
So we provide real data for Vantiq to ingest into their platform and do more downstream analytics and also alerts and everything else that needs to be done to manage those large systems. Dataset generation, so in the life cycle of AI, and we’ll get into that a little bit, dataset generation is really crucial in model training, and that’s still done in the cloud today, but then what happens is you push the model onto the inference engine on the edge so that the inferencing is constantly happening on these edge devices. You need to set up a GPU. We also run on CPUs, and these need to be scoped and configured and managed at the same time. So the camera deployment and [inaudible 00:06:44], we’ll get into some of those details a little later.
Emrah Gultekin:
So why is edge AI computing so important? This is a crucial question that many of us ask, and also why we’ve pursued this path. First of all, the most important thing is it works in disconnected environments. So if you have mission-critical tasks that need to be done, you can’t do that on the cloud. It’s just not possible. There are disconnections, there are connectivity issues, there’s latency and so forth. So that’s the first thing: it works in disconnected environments. Second thing, fast prediction, so under 16 milliseconds per JSON response; it’s very fast, and you need to have this type of fast inferencing on the edge. And basically we’re reducing cloud streaming costs. Streaming all this stuff into the cloud, especially videos and very large files, and actually doing the inferencing at the same time on a GPU in the cloud, it’s almost impossible to scale that with today’s costs, and there’s a lot of latency as well.
Emrah Gultekin:
And the other point of it is privacy, obviously. So if you have edge computing, all that data that’s inferenced remains on-prem, on the edge. And so data privacy and protection become easier in a distributed environment. And the final thing is light AI models. You’ve got models, you can swap the models, you can train new models and just deploy them, and you can basically use these light AI models on any of the edge devices that you have. So I want to get a little bit into the AI lifecycle, because this gives some understanding of why it’s important to have all of these on one platform. So there are three distinct buckets of processes in AI, and these kind of demystify the process.
Emrah Gultekin:
So the first thing is you want to be able to train the AI, right? So you want to train the AI to be an expert in your specific field or your use case. So how do you do that? You first have to create a dataset, right over here. So dataset generation is important. Dataset generation is crucial to the training process. So you need to be able to annotate, label and do [inaudible 00:09:07]. The second part is training, and that’s where you do the model training. So you choose a deep learning framework, you choose a neural network, and then you train the model. So training on the same platform is very important, and then finally, inferencing. Inferencing is also really crucial because that’s really the output that everyone cares about: how is this AI detecting things and how is it doing analytics more downstream?
Emrah Gultekin:
So these three things are important to have on the same platform so that you can actually scale this type of process and this type of implementation. So here’s the Chooch basic coverage of what our system looks like. You have a dashboard, it’s where you inference, where you can train, you can do data set collection, you can do dataset generation, annotation and so forth. So, that happens up here on the Chooch dashboard. And then what happens down here is you have a small subset of the cloud dashboard on the edge. And that is where the inference engine is basically.
Emrah Gultekin:
So you have an inference engine on the cloud, but you also have the same inference engine on the device. And that device is connected to any imaging device, basically, so any video streams or cameras, any type of imaging device that you have, going into this inference engine, which is on-prem, and then it generates JSON responses and metadata. Depending on what model you’re using and what type of GPU you have, it’ll do roughly three to five JSON responses per second. And then that goes into the client environment, where you can do more analytics, send alerts and all kinds of stuff. So this is basically how it works.
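To make the shape of those per-frame JSON responses concrete, here is a minimal Python sketch. The field names and values are illustrative assumptions, not Chooch’s documented schema; the point is simply that each frame yields a small structured payload that downstream logic can filter on.

```python
# Hypothetical shape of a per-frame inference payload; the field names are
# illustrative, not Chooch's documented schema.
frame_inference = {
    "device_id": "edge-agx-01",
    "camera_id": "cam-lobby-1",
    "timestamp": "2021-06-15T17:04:02Z",
    "predictions": [
        {"class": "fire", "confidence": 0.94, "bbox": [412, 180, 655, 420]},
        {"class": "person", "confidence": 0.88, "bbox": [120, 90, 260, 470]},
    ],
}

def detections_above(payload, threshold=0.9):
    """Return the classes detected in one frame with confidence >= threshold."""
    return [p["class"] for p in payload["predictions"] if p["confidence"] >= threshold]

print(detections_above(frame_inference))  # ['fire']
```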
Emrah Gultekin:
Creating a device, this is part of the dashboard. So you basically go in, you create a device, and here for example it’s an AGX in the San Mateo office, and then basically what you’re doing is you’re pulling the public models into the device. So you can do any of these, and we have over 115 public models available. So this one focuses on industrial safety. Detecting PPE improves worker safety; it’s used on construction sites, factories and warehouses, and you can do a lot of different things. This is the pre-trained PPE model, but you can also do no-go zones and all kinds of stuff that you may need. Basically improving the safety of that workplace, and also aggregating that data over time so that you can basically lower insurance costs and reduce fraud and bias.
Emrah Gultekin:
Another one here is fall detection. This is also a public model and it has a broad range of applications, obviously detecting when people fall, in warehouses, hospitals, nursing homes, and it provides fast emergency response to anything that’s happening in those public spaces or those private spaces. Another one that’s popular is smoke and fire detection. So understanding smoke and fire visually, and this is really crucial because you may have a fire system already, but what this does is augment it with visual detection as well. So with that, what we’re doing with Vantiq is we’re basically distributing the AI models onto different devices and providing these AI models everywhere, basically, so that they can be managed. They can provide the detection, they can provide the alerts and also aggregate the data over time where you can do more analytics on it. And my friend Brett from Vantiq is going to continue and show how we work together and how their systems work as well. So I’m going to hand it off to Brett. Thank you.
Brett Rudenstein:
Thank you, Emrah. And let me just share my screen. All right. What I’m going to do over the next couple of minutes is I’ll briefly describe what Vantiq is, how it applies to these vision-based models and streaming analytics in general. And then we’ll do a demonstration of combining the Chooch technology and Vantiq technology to arrive at very specific business outcomes. Vantiq is, as Tifani mentioned earlier on, a platform for rapidly developing real-time, event-driven systems. And events can be anything; they can be streaming data from IoT devices or, as we’re talking about today, they can be streaming sensor data from the inference engine that basically allows us to create that situational awareness to recognize various opportunities or threats. One of the things that we’ve noticed as a company over the last couple of years is there’s been a fundamental shift in basically transforming the way businesses think about computing.
Brett Rudenstein:
And that is moving from perpetual ownership inside of a batch-based world, where we look at things like next best action, which is very good for certain kinds of analytical processes, to a real-time world in which there are business outcomes that must be achieved, and they have to be acted on at the right time in the right place, all in a window where you capitalize on the opportunities and you mitigate the threats. We see digital twins coming out of these worlds where the digital twin can drive the physical world, or the physical world can inform the digital twin. And they have to be real time. They have to be able to identify those situations of interest to get the proper business outcomes. For example, one of the things that Emrah shared a moment ago was a video showing various kinds of fires. And one of those fires I noticed was on top of the stove; well, that’s a normal thing if it’s at a normal size and a normal rate.
Brett Rudenstein:
And so one of the things that you can do at Vantiq is distinguish between which fire is okay and which fire is not okay in those particular instances and build up those models very quickly. So in order to be able to do this, what the system is designed to do is three things fundamentally, to be able to sense, analyze and act on streams of data. That is to ingest data from any end point, for example the Chooch inference engine talks to an [MQTT 00:16:04] server. So picking up that data from MQTT or picking up directly over a web socket, we’ve got a number of ways to bring data into the system. The second area of course, is to be able to analyze that for situational awareness, those opportunities or threats that might be present based on the real-time streaming data, to be able to contextualize, aggregate, filter, and do all of these things in real time and in-memory without the bottleneck of traditional databases.
Brett Rudenstein:
And the last mile of this, of course, is the operational response. Once you’ve identified one of these situations of interest, you have to be able to arrive at a very specific business outcome, but in a real-time system you don’t have a traditional linear progression like you might find oftentimes in business process modeling or robotic process automation. Real-time data situations are constantly changing, and you’ve got to be able to iterate, you’ve got to be able to take the next best real-time action based on the current data at the moment. So Vantiq in this particular instance sits at the intersection of streaming sources, other systems, and of course, the people that interact with them. And so the fourth point to this, it has to be able to run anywhere. Chooch makes it so that you can easily run the object models, the vision models at the edge, and you can easily deploy them to the edge. But what about the business logic that has to go along with that?
Brett Rudenstein:
That can’t all be piped up to a single cloud; if you’re a casino and you’ve got a thousand cameras, you have to be able to do that at the edge as well. And so Vantiq is a fully distributed, multi-tenanted environment that allows you to put that business logic right next to the inference engine, right next to the Chooch engine, for maximum efficiency, for the lowest latency possible and, of course, for the volume, velocity and variety of the data that is coming out of these engines and needs to be processed. If you were to build a real-time application on your own, typically this involves many different kinds of tools and therefore expertise across a lot of different kinds of application sets. You have to become an expert, so to speak, in tying together all these different kinds of applications: a messaging application, a persistent storage layer, a stream analytics layer, the web and mobile layers, integrations with other systems and so forth.
Brett Rudenstein:
You would have to have the developmental expertise. There’s a certain level of complexity here, but then to keep these systems up and running with some number of nines of availability, there’s an operational complexity that has to be dealt with as well. So, different to what you’re seeing on screen now, Vantiq provides a single platform that is essentially, in some instances, a no-code platform, allowing you to use things like the application builder and collaboration builders to rapidly build, in a no-code way, these models that sense and analyze and act on real-time data. But it also is a low-code platform. You can get a little bit further under the covers and really create any business logic and distribute it where necessary. In fact, maybe better said, it’s a full-lifecycle platform for event-driven systems, including requirements, development, testing through unit and integration tests in completely distributed environments, a deployment and partitioning tool to deploy across, again, distributed environments, and even monitoring and management tools to understand the current state of those running environments themselves.
Brett Rudenstein:
What Vantiq has done then is we’ve worked with our partners, working with partners like Chooch to help them build smart applications, and most recently working on a number of smart city applications. Vantiq use cases traverse a number of different kinds of verticals, if you will. They are real-time applications. They run across multiple distributed environments. They are completely agile, meaning that it is easy to modify them, it’s easy to deploy them and it’s easy to hide the complexity of these systems. A number of the use cases we’ve been involved with, generally speaking, are around safety and security, field service and digital twins, and that’s what I’m going to do now. I’m going to start doing a small demonstration, and then I’m going to turn it over to Patrick to complete the demonstration here. Let’s go ahead and take a look at Vantiq here.
Brett Rudenstein:
Share my screen again. So one of the things that you’ll see inside of Vantiq is a development environment, as we make it very quick to build applications, but there are some other things that we do to make it even faster. What you’re looking at on the screen right now is what is called the Vantiq accelerator. This is a blueprint, if you will, that many of our customers will start with to basically take the blueprint and start with a digital twin for smart building or smart campus related use cases. And what we’ve done here is we’ve taken a couple of the different Chooch models. Actually, the three that Emrah mentioned a moment ago: we’ve taken the fire detection, the PPE detection and the fall detection, and in less than a day, outfitted this into the Vantiq accelerator. And so one of the things that you will see here, as you already can see, is we’ve got a series of buildings, floors, and spaces.
Brett Rudenstein:
So I’m going to show you how this works. Just going to close this out, because I want to recreate that issue. And so what we have inside the accelerator is essentially a mapping of a digital twin. Here you see we’ve got a number of buildings across a geographic landscape. If I drill down onto a particular building, you’ll notice that buildings contain floors and therefore floor plans, and then going into a particular floor, floors contain both spaces, the allocation of space (this is the office, this is the hallway, this is the stairwell), as well as the assets that live within those spaces. So for example, if I were to look at some of these assets, here you can see this IP camera. If I view the asset, you’ll see that it has one or more sensors; streams of data is another way to think about what a sensor is.
Brett Rudenstein:
And so here you can see that this has the Chooch 980 model, which is the firearms detection model. So let’s actually take a look at one of those models working live. I’m going to go over here to the main screen. I am going to bring up the camera that is pointed in this office building and also bring up a mobile view. So here we can see the phone at the same time. And in just a couple of moments, you will see that a fire is going to break out here in this particular office. So what’s going on here is the system is recognizing… you’ve just seen a notification on my mobile phone, and the notification, as I pick it up, you can see it says fire alert.
Brett Rudenstein:
It looks like the live feed hasn’t been updated yet, and you should see that updated in a moment. But as you see the fire alert [inaudible 00:23:18]. As you see the fire alert over here, and I open it up, you can see that it has detected the fire. It basically says it’s in the first floor records office, on the first floor in building one; it’s got the location and I could acknowledge it. You’re probably also noticing that I’m getting a message or a phone call right now. If I don’t acknowledge the fire in a certain amount of time, then the system places this phone call.
Automated:
A fire has been detected at building one, at first floor, at the first floor records office, please address the situation immediately.
Brett Rudenstein:
And so what you’ve seen here is basically a collaborative workflow that is using a series of escalations to not only identify and detect the fire, but report it. One of the reasons the fire is detected a few moments after it appears is, as we’ve said, to make sure this isn’t an outlier or an anomaly. So, we have told the engine to analyze this in memory and say, “I’m not looking for one inference of fire. I’m looking for six or more in a row in a particular area.” And the end of this particular story is that in the building where we acquired this video from, because this fire wasn’t identified soon enough, even though it was caught on a particular camera, the fire ultimately burned the building to the ground. And this probably could have been avoided with just a burnt desk; this was a laptop that essentially exploded on the desk.
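As a rough illustration of the debounce rule Brett describes (raise an alert only after several consecutive fire inferences), here is a minimal Python sketch. It is not Vantiq’s actual rule engine; the class name, camera ID and the reset-on-clean-frame behavior are assumptions made for the example.

```python
from collections import defaultdict

REQUIRED_CONSECUTIVE = 6  # "six or more in a row", per the description above

class FireAlertRule:
    """Raise an alert only after N consecutive frames contain a fire detection."""

    def __init__(self, required=REQUIRED_CONSECUTIVE):
        self.required = required
        self.streak = defaultdict(int)  # consecutive fire frames, per camera

    def on_frame(self, camera_id, detected_classes):
        if "fire" in detected_classes:
            self.streak[camera_id] += 1
        else:
            self.streak[camera_id] = 0  # any clean frame resets the streak
        # Fire exactly once, on the Nth consecutive hit, to avoid repeated alerts.
        return self.streak[camera_id] == self.required

# Usage: feed each frame's detected classes; True means "send the alert now".
rule = FireAlertRule()
for classes in [["fire"]] * 8:
    if rule.on_frame("cam-records-office", classes):
        print("FIRE ALERT: first floor records office, building one")
```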
Brett Rudenstein:
Last thing I’m going to do here before I turn it over to Pat is I’m just going to show you a little bit of the back end. Pat’s going to take you through personal protective equipment in a moment. And this is the application that is deployed to the edge. So this is the cloud application, the application that was built in the cloud; Pat will walk through this and describe it in more detail. But this application identifies the PPE conditions for something missing. And then this PPE detection deployment basically allows us to tell the model where to go. So it’s going to go to Boston and Colorado. So each of these is a completely distributed location, and we push this down to the inference engine. And hopefully you’ve gotten a good sense of some of the kinds of capability of Vantiq. I’m going to turn it over to Pat now to continue with the demonstration.
Patrick Burma:
Thanks, Brett. Yeah. [inaudible 00:25:47] my video back up. So we have put together a live presentation for everyone. This will be a brief lab demonstration, and we’ll focus on the safety and security aspect of the use case. So what I’m going to put up on my screen here, and it’s going to be a little busy, is a live view of the Chooch inference engine and what it outputs, which is running locally on my home network, and a live view of Vantiq, which is also running locally on my home network. So these are the edge instances, and I’m going to bring up the Vantiq mobile app on my mobile phone, just as Brett did. You can see the fire alert he just created, so that we can see these notifications come in, in real time. And the point of this exercise is just to show everything working all at once.
Patrick Burma:
So we get an idea for what running this in a real environment really means and why we run it at the edge as opposed to just pushing everything up to the cloud. Okay. So I’m going to go ahead and start my camera. My camera is based on motion detection, so it’ll take a second to fire up. And once it does, we’ll start feeding information into the Chooch engine, which is a really nice feature. So if you have a camera that goes to sleep, when it wakes up, Chooch will just immediately start processing it, as you see here. So for our PPE demonstration, we could have tackled this a couple of different ways. One of the ways would be what I consider the green-lighting scenario, where you have a person who steps in front of the camera, they put on their safety gear, or they already have it on, and then they get a green light to go into the work environment.
Patrick Burma:
The engine that we’re using from Chooch, the PPE engine, detects things like safety helmets and vests, gloves and masks, so we can apply this to a number of different use cases and solutions, even things like normal lobby detection where we want people coming into the office to be wearing a mask for COVID safety, and then we can green-light those people. The way we’ve adapted the use case for this presentation is we assume that we’re monitoring a field of view, like a construction site, that already has people in it. And we’re looking for violations in safety gear, and then we’re going to be alerting the foreman in real time. So let’s say we have someone out on the site and they take their hard hat off because they’re eager to show everyone their COVID hair.
Patrick Burma:
What will begin happening is that the Chooch engine will start to detect hard hat alerts, but that alert isn’t going to be generated right away. And this is intentional. So what we’ve programmed into the application logic is a certain number of consecutive readings that have to occur before Vantiq actually triggers the hard hat notification and the PPE alert which you just saw. And this is so that if there’s an obstruction in the view, or someone takes their hat off for just a second and puts it back on, we’re not triggering a continuous stream of alerts. So we’re able to toggle the logic a little bit in order to prevent that. And then on top of that, we have one camera that’s running multiple inference engines. So this camera is actually running the PPE engine, it’s running fall detection, and it’s actually running fire. Now, I can’t actually generate a fire in my home office here; it’s not desirable. So I’m going to go ahead and trigger a fall down alert by falling down.
Patrick Burma:
And again, the triggering of the alert is based on a number of consecutive readings. We capture the image itself and we send it into Vantiq, and you guys can see on my mobile phone here that these notifications have occurred very quickly and in real time. Now, as part of a real world environment, here we probably don’t want to instantaneously trigger any alarms or alerts. So what we’re giving the site foreman is an opportunity to review the information, maybe they’re onsite, maybe they’re offsite, and use their human intuition to determine whether this is a serious safety issue or not. Now in Vantiq, we can build very complex remediation workflows, as Brett was showing you with the phone call for the fire and stuff like this, so that our actions can be based on a number of acknowledgements or third-party system integrations, creating tickets, stuff of that nature, so that we can ensure that we’re in safety compliance. Whether it’s corporate compliance, COVID-related compliance, or OSHA compliance, we’ll provide the information to the users who need it and then allow them to take the appropriate actions.
Patrick Burma:
So we’re using the computer vision system in this case to help supplement human decision making in a way that is very real time, so that as these problems unfold, they don’t become more serious. So, as I mentioned, all this is running at the edge, and because of the way the two systems work, we can then easily expand this into other use cases. For example, if we wanted to add social distancing, or we wanted to add firearms detection, or some other capability, we could easily incorporate that into our edge deployment through the way Chooch is set up and configured, which I’m going to show, and through how Vantiq does application development and deployment, which I’ll show a little bit as well. So we can start with a simple use case like fall detection and really expand our system with the same equipment, same camera, the same edge computers, and take advantage of those for additional use cases.
Patrick Burma:
So I’m going to walk a little bit backwards from the end application that you just saw, which is the PPE and fall detection and alerting, through the initial deployment, so that you get a feel for what a real world deployment looks like on a bit of an end-to-end basis. So what we’re doing here is really running all the business logic that you see at the edge rather than on the cloud side. So all that information is coming in from Chooch, and I’ll discuss how we get information into each system, so from Chooch into Vantiq at the edge, and then we start to process that information and look for those different scenarios, like fall down or missing safety equipment. And then we take the relevant information and we send it up to the cloud. And what this means is, at the edge, there’s a ton of transient data being generated that isn’t being sent to the cloud. So we don’t have to worry about bandwidth or latency concerns with that; all the vision processing, and all the heavy data processing, is going to happen at the edge.
Patrick Burma:
Now in the cloud, Brett showed you what we call our Vantiq accelerator, and the Vantiq accelerator is a template project that we’ve created that we offer to our customers for free, and it provides a starting point for this digital twin type interface. And in this digital twin interface, we can easily onboard, in this case, sensors, which are your devices. These could be cameras, these could be other types of sensors, and then we can incorporate them into a digital twin interface. And that could be taking advantage of many different things. So, for example, if you wanted to do location tracking of a person, you could have a BLE sensor. Those BLE sensors aren’t necessarily that accurate, so maybe we’re going to increase the accuracy of the BLE with the computer vision AI engine. And then we can put all that into this digital twin and we can in fact do tracking of that through this interface.
Patrick Burma:
So we actually have these badges representing people, and we could replace this with a more sophisticated 3D model; in some of our use cases, we’ve even replaced this with 3D models of different objects, people, vehicles, stuff like that. So if you have privacy concerns about using camera data, the camera data and any imaging could just be completely thrown away or not generated at all. And then we can actually enhance our perspective, using the digital twin interface, through the computer vision system or through a combination of other systems and sensors that you might include in that environment. So that’s how we would address things like PII and privacy concerns. And there are other technologies out there that we can take advantage of, like face blurring and stuff like that. All right. So additionally, since all the critical alert information is going up into the cloud, we can store that information and we can use it for some historical analytics. And here you see, because we’ve been running this demo quite a lot lately, we’ve had quite a number of fires going off.
Patrick Burma:
In our buildings, this is a very unsafe environment if you’re looking at this chart seriously, but what we’re able to do is capture those events and allow someone to go to a dashboard interface and review the historical information, previous incidents that occurred as well as historical incidents, and the value of the historical information is knowing where problems occur and what type of problems occur. So if this is a field situation, or if this is an indoor scenario, or maybe it’s even a COVID scenario and people are trying to sneak by the desk without masks, we’d be able to isolate where those issues occur most frequently and what types of issues they are, and then we can go in and try to address those problems to reduce them and make our workplace a safer environment.
Patrick Burma:
So that is the application front-end. So, we do assume that most people won’t be sitting here staring at a dashboard waiting for these issue alerts to come in, which is why we rely on mobile alerts, which we’ve shown you coming in the form of push notifications. But we can also do text alerts, a [inaudible 00:35:58] notifications.
Patrick Burma:
You saw the phone call coming from Brett; email alerts, whatever type of alerts you want, those remediation workflows can vary from very simple to very complex. So now I’m going to get a little bit into the system setup and how this thing got built and deployed. So from a Vantiq perspective, everything we do is developed in the cloud. Earlier, I was showing you a view of Vantiq, really a developer view of Vantiq in my local environment; now I’m showing you the same thing in a cloud environment. And when we were building this presentation, this is where Brett and I would go to develop the actual application logic. And what’s neat about this, as you guys see this GUI application here, and I know you’re probably thinking it looks a little bit like business process management or data flow, but what it really is, is a GUI representation of event-driven applications.
Patrick Burma:
So each of these little blocks and diamonds that you see represents a chunk of code, which we call procedures; you can think of them like functions or lambda functions that are executed whenever an event comes in. So when the Chooch inference engine sends us a message, that triggers the code execution, and this all happens asynchronously. So we could have 50 cameras sending us messages all at the same time, and that would initiate 50 code executions at that first layer all at once. And that’s the basis of an event-driven architecture and how serverless systems work, because your resource utilization expands and contracts based on how many events are coming into the platform.
Patrick Burma:
Now in this GUI application that I’m showing here, there’s a combination of pre-written code that is provided by Vantiq, which provides some programming logic that’s very common to event processing, like filtering and splitting messages and unwinding arrays, and then we also supplement that with code that we write ourselves and that we work into here. So anytime we need custom business logic, we just write the code by hand, and then we incorporate it into this GUI application, and this is what ends up getting deployed to the edge. Now, these arrows here don’t actually represent a business process flow, as I mentioned; what it is really showing you is an outgoing event that is being published, and the next application stage is subscribed to that event stream, which is causing the execution of that downstream event.
Patrick Burma:
So that’s the visualization of the event-driven application model, which is neat. And what that means is you can have multiple subscribers from a single publisher, just like any pub/sub system, which is how you see something like this “no hard hat” splitting into two directions. So we can have multiple asynchronous operations going, like a database write that’s happening on one thread while, on another, there’s business logic looking for certain text, like we are in the Chooch message. So that’s how Vantiq works at its core, and we take advantage of this graphical interface to do very complex things, like combining the Chooch AI inference engine data with something like a BLE badge tracker to create that highly accurate location tracking, which we then put into the GUI client, which is also built through the Vantiq interface, which we’re not going to get into for this presentation.
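For readers unfamiliar with the pattern, here is a minimal, in-process Python sketch of the publish/subscribe fan-out Patrick describes, where one published event (such as a “no hard hat” detection) drives several independent subscribers. It is a generic illustration, not Vantiq’s implementation; it runs synchronously for brevity where the real system is asynchronous, and the topic and handler names are made up.

```python
from collections import defaultdict

subscribers = defaultdict(list)

def subscribe(topic):
    """Register a handler for a topic; each handler mirrors one block in the app graph."""
    def register(handler):
        subscribers[topic].append(handler)
        return handler
    return register

def publish(topic, event):
    """Fan one event out to every subscriber of the topic."""
    for handler in subscribers[topic]:
        handler(event)

@subscribe("ppe/no_hard_hat")
def write_incident(event):          # one branch: persist the incident
    print("DB write:", event)

@subscribe("ppe/no_hard_hat")
def notify_foreman(event):          # another branch: push a mobile alert
    print("Push notification for camera:", event["camera_id"])

publish("ppe/no_hard_hat", {"camera_id": "cam-site-3", "missing": "hard hat"})
```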
Patrick Burma:
Okay. So for the actual deployment, here’s how it works. On the Chooch and Vantiq side, we both offer an MQTT connection. MQTT is a message broker that acts as a publish/subscribe middleman; a lot of IoT devices these days use it, as do other types of systems, and we are using it because both systems provide it out of the box, which allows us to take advantage of a very simple way to connect to each other. So what we do is, on the Chooch side, we have a configuration interface that allows us to specify the MQTT broker we want to send to. Now, there is no MQTT broker at the beginning of all this, right? There’s nothing; there’s no Chooch, there’s no Vantiq, the edge is just a system that doesn’t yet exist. So we provide an alias; we call it MQTT.local.
Patrick Burma:
And then we also need one more piece of information to get this to work, which is the IP address of your camera stream, which you can see here, and then you can add the models that you want for that camera stream. And all this can be fully automated with Chooch through their REST API. So the onboarding process for a new camera could be as simple as punching the IP address into a mobile app, and all this gets set up for us. On the Vantiq side, we do something very similar. We have what we call source connections, which allow us to connect to something like an MQTT broker, and then it’s obviously not running because this is the cloud and the MQTT broker is going to get deployed to the edge, but we use the same credentials, right? We have the same MQTT.local alias, and this is going to be how the information gets from Chooch to Vantiq.
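As a sketch of what consuming those inference messages at the edge could look like, here is a small Python example using the paho-mqtt client pointed at the MQTT.local broker alias from the demo. The topic layout and payload fields are assumptions, not Chooch’s documented output.

```python
import json
import paho.mqtt.client as mqtt

BROKER = "mqtt.local"          # the broker alias both systems are configured with
TOPIC = "chooch/inferences/#"  # assumed topic layout, not a documented Chooch topic

def on_message(client, userdata, msg):
    payload = json.loads(msg.payload)
    # Hand each per-frame inference off to the edge business logic here.
    classes = [p["class"] for p in payload.get("predictions", [])]
    print(msg.topic, classes)

client = mqtt.Client()
client.on_message = on_message
client.connect(BROKER, 1883, keepalive=60)
client.subscribe(TOPIC)
client.loop_forever()
```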
Patrick Burma:
So the next part of this is to actually deploy these systems and set them up and start them running. And because Vantiq and Chooch both provide Docker images, we can do this whole thing with Docker. So the way we’ve configured Docker is we’ve got an environment that includes the Chooch image and the Vantiq image; this is an example of the environment that we use for Chooch that’s currently not running on my laptop here. We have MongoDB, which is the persistent storage layer for Vantiq, and then we have Mosquitto, which is the MQTT server. In our Docker configuration, we simply give that Mosquitto server the host name of our MQTT alias, so that when I turn this on, Chooch can talk to it immediately and Vantiq can talk to it immediately. Everything is set.
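Here is a minimal sketch, using the Docker SDK for Python, of the kind of edge stack Patrick describes: a Mosquitto broker reachable under the mqtt.local alias, MongoDB as the Vantiq edge node’s persistent storage, and the two vendor images alongside them. The Chooch and Vantiq image names are placeholders, and a real Mosquitto 2.x deployment also needs a config file that allows non-local connections.

```python
import docker

client = docker.from_env()
net = client.networks.create("edge-net", driver="bridge")

# MQTT broker, reachable from the other containers as "mqtt.local".
broker = client.containers.run("eclipse-mosquitto:2", name="edge-broker", detach=True)
net.connect(broker, aliases=["mqtt.local"])

# Persistent storage for the Vantiq edge node.
client.containers.run("mongo:5", name="vantiq-mongo", detach=True, network="edge-net")

# Placeholder image names -- substitute whatever Chooch and Vantiq actually provide.
# client.containers.run("chooch/edge-inference:latest", detach=True, network="edge-net")
# client.containers.run("vantiq/edge:latest", detach=True, network="edge-net")
```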
Patrick Burma:
So the deployment process itself for, say, an enterprise IT environment would be to use something like Kubernetes or some enterprise-level container management system to deploy these images in this particular configuration, and when they go out, they’re just automatically up and running, which is a really nice capability. Then the deployment of the application logic still has to be done. Vantiq, as Brett showed you, has a deployment configuration. The PPE one is very simple; it just has the PPE logic. When I want to deploy to all my edge instances, in this case we just have two, Brett’s office and my office, but I could have hundreds of them, I just hit that deploy button. And the onboarding process for additional locations is based on a tagging system. So I can give them all a tag like “edge device” and it’ll deploy to all those edge devices, whether it’s one or whether it’s 10,000. Now that application logic has been deployed there. And because Vantiq is also REST-addressable, this entire process can be fully [inaudible 00:43:00] as well.
Patrick Burma:
So the onboarding of the underlying images, the onboarding of the application logic, and the connection between the systems through the MQTT broker can be 100% fully automated. The only thing we need to provide is the edge system itself and that connected IP camera. [crosstalk 00:43:21]
Brett Rudenstein:
Looks like we’ve got a couple of minutes and just enough time for a little bit of Q&A.
Patrick Burma:
So, okay. Let me stop my screen sharing and we can slide into the Q&A phase.
Brett Rudenstein:
Turn it back to Emrah.
Emrah Gultekin:
Thank you, Patrick. Thank you, Brett. I think we have a few questions. Jeff, Tifani, I can address some of these regarding accuracy. This is the golden issue with these models. So basically we try not to deploy models that have an accuracy F1 score less than 0.9, so 90%. And that’s the harmonic mean between precision and recall; our precision is usually much, much higher, like 95, and sometimes the recall is less. What we’re seeing is that a lot of the enterprises and partners using these systems are more affected by false positives rather than false negatives for these types of scenarios, these types of use cases. But at the end of the day, it is a platform, so you can train your own model. You can annotate yourself, you can test yourself, you can generate these reports on the system yourself and deploy yourself if you’re comfortable with the accuracy of that model, and you can iterate on those models as well. So sending more data in, annotating it (we have smart annotation, we have video annotation), then uploading it into the model, training again and iterating on your models is really crucial.
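For reference, the F1 score Emrah mentions is the harmonic mean of precision and recall; the snippet below shows the relationship with illustrative numbers (precision around 0.95, recall somewhat lower) and the 0.9 deployment threshold.

```python
def f1(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Illustrative numbers only: precision ~0.95 with a somewhat lower recall.
score = f1(0.95, 0.87)
print(round(score, 3), "deploy" if score >= 0.9 else "keep training")  # 0.908 deploy
```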
Emrah Gultekin:
The issue with accuracy is that it needs to be dynamic, right? So, this whole life cycle of datasets, model training, and then inferencing. If you don’t have all of that, you’re never going to know the accuracy of your model. So you need to be able to inference on the spot. So you do the inferencing on the cloud, you generate F1 scores, and then if you’re comfortable with the accuracy for that particular use case, you deploy it on the edge. I think there was another question about [Azure 00:45:39] or what the difference is. Basically the main difference is edge deployment of the models. As far as we know, there isn’t anything in production yet from any of these other big players, especially for edge deployments. What we’re really talking about here is an enterprise system. It’s not really about algorithms. It’s not really about models. It’s not really about developer tools. It’s about an enterprise system that can be easily plugged in and used.
Emrah Gultekin:
I go back to this thing: you know, we had [Apple Toohey 00:46:22] in 1985. We sent our first message to my friends two blocks down in 1986 and we thought we had solved messaging, but 30 years later you still have messaging developing. And we’re at the beginning phase of this. So it’s very early in this process. So I wouldn’t get too hung up on the algorithms. It’s not about the algorithms or models themselves. It’s about an entire system, and it’s even more than that. It’s about an ecosystem working together like we are with Vantiq. Brett and Patrick, do you want to say anything about accuracy?
Brett Rudenstein:
With respect to… sorry-
Patrick Burma:
From our perspective, one of the things we try to do as application developers is take accuracy into account, and the computer vision systems can be in some cases really accurate, like 99.9% accurate, but they’re never really a hundred percent accurate. So when we build and design applications, like the ones we demoed today, we try to add in some elements to supplement those accuracy numbers. But of course, in the demo that you saw today, we’re using public models provided by Chooch; typically you want much more accurate models that you would train with custom datasets, and they would be more reliable and we would have less manual intervention and more automation.
Brett Rudenstein:
Maybe the other thing I’d say is, it’s also important to understand the angles of your cameras and things of that nature. So some models are trained at a more direct level. And you’ve got to use the right model for the right application to get the highest level of accuracy out of it. One model that detects cars isn’t a model to detect all cars, because you might have aerial or drone views that are trained at higher angles and therefore are better at detecting them, which is why a lot of times when you see public models, the COCO models and whatnot, if they look at a car from the top down, they think it’s a mobile phone. You’ve got to use the right model to get the right accuracy.
Emrah Gultekin:
I want to add to what Brett and Patrick just said. So the business logic afterwards, having consecutive hits at certain percentages, that whole business logic is really crucial to a use case. And that’s what Vantiq has done: they’ve developed rules-based rules engines on top of the JSON response that’s coming out from the model. The model itself provides a JSON for that particular frame. But what you really need is something that’s consistent with your use case. So if you’re very sensitive to fires and you see just a spark and you want to be alerted, your rules engine has to reflect that. If you’re not that sensitive and you have sparks flying all over the manufacturing plant, because that’s part of your operation, then you want that rules engine to be more flexible. So this is all part of that business logic that comes after the fact of the AI predicting something from the deep learning frameworks.
Brett Rudenstein:
Yeah. This is where the Vantiq piece comes in. And as I was alluding to earlier, if you’ve got a fire on the stove top, that’s probably pretty normal, but when the bounding box gets to a certain size, that indicates that it’s more of a problem. And so I don’t know if it’s necessarily an accuracy issue as much as it is identifying the correct situation of interest. Fire isn’t necessarily a situation of interest; the kitchen is on fire is the situation of interest.
Emrah Gultekin:
And then really, what are you comparing it with? You’re comparing it with a human sitting there, and human accuracy is around 60%, actually, because of lots of different things, and you can’t scale humans. So it’s a zero or one, it’s binary: either you use these types of applications to better help and support your processes, or you don’t use them at all and you use humans to do it.
Jeffrey Goldsmith:
A general question: we’re getting some questions about what kind of coding languages are used for Chooch. I see another question about using data processing to retrain models.
Emrah Gultekin:
Yeah. It’s mainly written in Python, and we’ve used a number of different deep learning frameworks from TensorFlow to PyTorch to [Gloom 00:51:39]. It’s all part of the system. We generally use the ResNets. So it’s a cluster inside the system. And then what happens is, depending on the data that you provide or the context and the application that you’re using, the system automatically distributes them to the right frameworks. Deep learning frameworks are tricky because training is one thing and then inferencing is another, and inferencing is a very hairy issue. So getting those inferences down, getting those models down to the edge, requires other types of tools like TensorRT and so forth.
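As a generic illustration of the “train in a standard framework, then push a light model toward the edge” flow described here, the sketch below loads a PyTorch ResNet and exports it to ONNX, a format that edge runtimes such as TensorRT can consume. This is plain PyTorch/torchvision usage, not Chooch’s pipeline.

```python
import torch
import torchvision

# Load a pretrained ResNet-50 (a stand-in for whatever model was trained in the cloud).
model = torchvision.models.resnet50(weights="IMAGENET1K_V1")
model.eval()

# Export to ONNX with a dummy input matching the model's expected frame size.
dummy = torch.randn(1, 3, 224, 224)
torch.onnx.export(model, dummy, "resnet50.onnx", opset_version=13)

# The resulting .onnx file is what an edge runtime (e.g. TensorRT) would optimize
# and serve on-prem, close to the cameras, for low-latency inferencing.
```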
Jeffrey Goldsmith:
We have a very interesting question about microscopy, which is something we’ve addressed. And an attendee is asking, is it purely object detection or can you deploy instance segmentation, both counting and feature measurement? We’ve done that, haven’t we Emrah?
Emrah Gultekin:
Yeah. So hierarchically, it all boils down to an object detection problem. So even segmentation is an object detection problem; if you can’t detect the object, you can’t really segment it. So the system has object detection and segmentation. And then what it does is image classification on top of the object detection to classify those things. So it’s a multi-layered process. Object detection models themselves, or object detection tools themselves, are not accurate enough to deploy in the real world. They’re just not; you need to have a layer of image classification that goes with it. So yes, that is part of what we do. We haven’t done measurement, but we have addressed it in past conversations.
Jeffrey Goldsmith:
We have another interesting question from an attendee. Do your models allow for nuanced grading of detected events: fire, no fire, or small, medium, large fire, for example?
Emrah Gultekin:
Yeah. It’s a good question. So the detection is done on a model that has classes in it. And you can have thousands of classes in a model, or you can have two classes in a model. The one that’s public right now is fire: either fire or no fire. But we have deployed where there are defects, for example, like a small defect, a medium defect, a large defect, and those are different classes within that model. So it depends on how you annotate and label your dataset. So if you label your dataset with small, medium and large fires, and then you train it that way, you’re going to get that output that way.
Brett Rudenstein:
This also speaks a little bit to what you have to do after the inference itself, because a lot of things will depend on how close the camera is to the object. And so you may want to make a determination: a candle fire is probably fine, it’s a candle and it’s expected to be there. And so being able to make a differentiation between what is one thing and what is another, and then contextualize that all into “okay, this is a problem” or “this is fine,” is an important part of the post-inference process.
Patrick Burma:
I was going to add that a lot of the programming, Brett, you know, I have done in the past on this issue is determining scale, because when you’re looking at an image, based on the angle of the camera and where things appear in it, something close up looks very big and something far away looks very small. So there’s a lot of scale you have to account for in those types of detections where you’re trying to figure out things like size.
Brett Rudenstein:
Or if you’re trying to determine GPS coordinates and it’s based on a two-dimensional image, things of that nature.
Jeffrey Goldsmith:
We have a question about the processing power needed on the edge for these sorts of deployments. Does anyone want to answer that? What kind of devices do we need?
Patrick Burma:
It’s one of those loaded questions, because the answer is always going to be “it depends on the use case,” but it varies from, let’s say, the bare minimum, something maybe a little more powerful than a Raspberry Pi or one of the Jetson Xavier type systems, to something very high end with multiple GPU units in it. So it does really depend on the use case, but the word we always use is right-sizing. And you want to take into account that you may have systems that can process multiple cameras at once, things like that. What our frame rate of detection is, how frequently we want to send things like images to the cloud, and how [inaudible 00:56:40], whether they’re impacted by that. There are a number of things we consider when we make that assessment.
Emrah Gultekin:
I think there was a question about GDPR privacy. I want to address that, and it’s a good question, because one of the reasons we moved all of our inferencing to the edge was because of GDPR and PII. So we work in the healthcare business, we power smart operating rooms, and that information cannot leave the surgery room, basically. So that was a trigger for us to put all of this inferencing on the edge. So we don’t have access to the edge dashboard that the clients have; nobody has access to it. It does system upgrades from time to time, if the user wants to, and it does model updates from time to time, if the user wants to do that. There is no backflow of information into the cloud as far as Chooch is concerned, but what you do see is… and also, we don’t store any of that information. We store the last 500 predictions and then it’s erased from the device. So that’s how we are addressing those issues of privacy. And it’s all distributed at the same time. So, that’s one thing.
Emrah Gultekin:
Training data is more sensitive, and we are GDPR compliant on the cloud for that training data. But all the inferencing, we don’t see any of that; it never goes to the cloud. It’s all on the edge.
Jeffrey Goldsmith:
We had two more questions, but we’re almost at the top of the hour. One is about partnering with us on a configuration, as a team, providing a sandbox or client conversation and a demo or POC area. And we can do that. Another question about smart shopping solutions is very detailed. Does anyone want to talk about smart shopping or how we demo for this? Someone else asked about next steps. I think we should probably move to that because we’re at two minutes before the hour.
Brett Rudenstein:
I can just quickly address that. We’ve done some smart shopping applications. Some of them are vision-based, some of them are not, some of them are based on weight or a combination of weight and vision, vision sometimes not being a camera but something like barcodes and things of that nature, but some of them are also purely vision-based. So, looking at groceries as they come down a conveyor belt and determining, okay, that’s broccoli, that’s an apple and so forth, and vision assistance for the tellers and cashiers and things of that nature. Some of them are more automated, so we do have some experience in those arenas and we’re certainly happy to talk more about it at another time.
Jeffrey Goldsmith:
Yeah. And yes, we do provide POC and demo areas. You can sign up for Chooch AI from our website and explore the dashboard, start to build things and we can support you in that.
Brett Rudenstein:
As do we.
Jeffrey Goldsmith:
Yes.
Tifani Templin:
Yeah. Our intent is to send the recording of this and also follow up with those questions that we didn’t get answered, I believe. Anyone who asked for a specific reach-out or what the next steps are, I believe we have your contact information and we’ll be in touch very soon. We’re here at the top of the hour.
Jeffrey Goldsmith:
Yeah. Thanks for attending everyone. And it’s been a great discussion. Thanks Vantiq. Thanks Emrah for the presentation.
Brett Rudenstein:
Thank you. A pleasure.
Patrick Burma:
Thank you guys. Appreciate it.
Emrah Gultekin:
Thanks everyone for joining.
Brett Rudenstein:
Bye.
Emrah Gultekin:
Bye.
