Artificial Intelligence and Machine Learning (AI/ML) have arrived in a big way. What was once considered a strictly academic pursuit ($0 spend in 2013) is now a critical component of business strategy across several industries (more than $50B spend in 2020).
One significant contributing factor to AI adoption is the cloud. With easy-to-use tools and frameworks, the cloud offers data- and compute-intensive services at massive scale yet at an affordable price.
But developing and deploying AI/ML models in the cloud is not always smooth sailing. In this blog, we will talk about what holds enterprises back from realizing their AI potential and how Alkira’s solution and technology help them overcome some of these barriers.
AI/ML in a Nutshell
Machine learning, a sub-domain of AI, is the practice of having computer models learn and discern patterns, trends and correlations in data without explicit programming.
Traditional programming takes well-defined inputs, processes them against predefined rules and produces the desired output. Machine learning, on the other hand, takes well-defined inputs and outputs, uses algorithms (linear regression, neural networks, etc.) to create programs (or models) on its own, and deduces outputs for any new inputs (as shown in the loop above).
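To make the contrast concrete, here is a minimal sketch, assuming Python with scikit-learn installed; the numbers are purely illustrative. The traditional program encodes the rule by hand, while the model infers it from example inputs and outputs.

```python
# Minimal sketch contrasting explicit rules with learning from examples.
# Assumes scikit-learn is installed; the data is purely illustrative.
from sklearn.linear_model import LinearRegression

# Traditional programming: the rule (double the input) is written by hand.
def traditional_program(x):
    return 2 * x

# Machine learning: the "rule" is inferred from well-defined inputs and outputs.
X = [[1], [2], [3], [4]]          # inputs
y = [2, 4, 6, 8]                  # corresponding outputs
model = LinearRegression().fit(X, y)

print(traditional_program(5))     # 10, produced by the hand-written rule
print(model.predict([[5]]))       # ~10, deduced by the learned model
```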
Input data is often very large, structured or unstructured, and comes in various shapes and forms. Examples include media (digital photos, audio, video), documents (spreadsheets, log files, emails), mobile communications (instant messaging, chats, collaboration software) and IoT (sensor and ticker data). The models look deep into these disparate datasets and provide predictive and prescriptive insights that humans typically cannot produce on their own. This unlocks a plethora of use cases like fraud detection, customized recommendations, healthcare analysis, asset optimization and much more.
The following diagram depicts a typical AI/ML model lifecycle:
The first and the most critical step is to acquire, prepare, label and manage large datasets. Depending on the application and the use case, appropriate ML models are developed and produced. Sometimes models aren’t written from scratch; pre-trained models for related tasks (image recognition, sentiment analysis) are fine-tuned to match current expectations. Often, multiple models are developed and tuned at the same time for the same use case; each individual model is then evaluated for accuracy, precision and recall (F1 score) and performance against well-defined preset metrics. The winning model gets deployed in a production environment, gleaning valuable insights from new data. With change being the only constant in the real world, the model is continuously monitored and measured against new outcomes. It is tweaked, updated and eventually replaced when its outputs no longer match business goals and objectives.
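As an illustration of the evaluation and selection step, here is a hedged sketch, assuming scikit-learn and a synthetic labeled dataset: two candidate models are trained on the same data, scored with F1, and the winner is picked for deployment.

```python
# Minimal sketch of the "evaluate and pick the winning model" step.
# Assumes scikit-learn; the dataset is synthetic and purely illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Several candidate models are trained on the same data ...
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(),
}

# ... and each one is scored against a preset metric (F1 here).
scores = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    scores[name] = f1_score(y_test, model.predict(X_test))

winner = max(scores, key=scores.get)
print(scores, "->", winner)   # the winning model goes to production
```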
AI and Cloud
As can be seen, coming up with tangible ML models involves iteratively running many open-ended experiments using intensive compute and storage operations. With proprietary hardware and software, the costs add up significantly in a short time with little or no reward. The cloud offers a reprieve; here are some benefits of using the cloud to fuel your AI/ML models:
- Pay-per-use pricing: the enormous IT resources required to build ML models can be shut down once processing is done (see the sketch after this list)
- Elastic infrastructure that can scale up and down depending on AI/ML workload burstiness
- Instant access to ML-optimized resources: large-scale data stores and compute resources (GPUs, optimized VMs) can be acquired and accessed immediately without heavy upfront costs
- Pre-built algorithms and pre-trained models that can be used for faster innovation
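As a rough sketch of what pay-per-use and instant access to GPU resources can look like in practice, the snippet below (assuming the AWS SDK for Python, boto3, valid credentials, and placeholder AMI/instance values) acquires a GPU instance only for the duration of a training job and terminates it as soon as processing is done.

```python
# Hedged sketch of pay-per-use compute: spin up a GPU instance for a training
# job, then shut it down so billing stops. Assumes boto3 and valid AWS
# credentials; the AMI ID and instance type are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Acquire a GPU-optimized instance for training (IDs are illustrative).
resp = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder deep learning AMI
    InstanceType="p3.2xlarge",         # GPU instance billed per hour
    MinCount=1,
    MaxCount=1,
)
instance_id = resp["Instances"][0]["InstanceId"]

# ... run the training job on the instance ...

# Shut the resource down as soon as processing is done.
ec2.terminate_instances(InstanceIds=[instance_id])
```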
In recent years, cloud providers have also done some of the heavy lifting and now offer various services for well-known AI/ML problems; below is a brief summary:
AI/Cloud Challenges
The solid foundation required to build a robust ML model is data, and for a terrain as difficult as ML modeling, the foundation needs to be even sturdier. No matter how sophisticated the ML algorithm is, the results are directly tied to the quality and the quantity of the data that fuels it. Much of the data organizations model is at the core of their business; it typically includes consumers’ sensitive personal and identifiable information, as well as health and financial records. Leveraging cloud services therefore raises the following concerns and needs:
- Given the widespread accessibility of the public cloud, the security risks associated with sensitive data are amplified in the cloud
- Regulations like PCI DSS and HIPAA require organizations to strictly limit access to protected data (credit card data, healthcare patient records) to only certain employees. This requires organizations to segment and isolate the network so that only employees with legitimate needs can access the data
- Enterprises store data in silos: in the cloud, on-premises and in HDFS clusters. And since not all AI/ML use cases are alike, organizations prefer (and require) a flexible hybrid cloud environment that meets all their AI/ML demands
- Reliable network infrastructure that transfers high-velocity data with minimal latency to support real-time transaction and analytics models
- The ability to quickly detect and fix hotspots and blind spots in the cloud environment so that AI/ML models continue to thrive
Deploying AI/ML models across multiple clouds can look daunting at first, but with Alkira, you easily can.
Alkira – Cloud Area Networking
Alkira Cloud Area Networking is the industry’s first low-latency, high-performance global hybrid cloud network with security and application solutions, all offered as a service. Using an intuitive canvas or IaC network automation, customers can get their branches, data centers, clouds and remote access users seamlessly and securely connected within minutes. Alkira’s unified management platform makes it very easy to deploy AI/ML models in a multi-cloud environment with all necessary security and policy controls. Here is how this can be accomplished:
Network Security Marketplace
Customers can choose from a wide range of security providers in the Alkira marketplace; the chosen service is intelligently inserted and integrated into the cloud environment without additional routing overhead. All sensitive data that travels the length and breadth of the network is secured, and data sovereignty isn’t lost. And as the volume and velocity of the data fluctuate, the security instances scale up or down, making security (like compute and storage) truly elastic.
Network Segmentation
Network segmentation is the ability to compartmentalize a network into smaller domains so that different policy and routing controls can be applied to each domain individually. Native cloud constructs offer no such segmentation, which makes data compliance and governance very difficult and complex. With Alkira, network segments can be centrally carved out across the whole hybrid cloud, making data compliance easily achievable.
High Performance Network Connectivity
Alkira offers the best hybrid cloud infrastructure in the industry, with an ultra-high-speed, low-latency unified network backbone. Customers securely connect to the nearest Alkira point of presence (Cloud Exchange Point); with a fully meshed network backbone, live data can be used to drive real-time ML models very efficiently. With Alkira’s architecture, geography is no longer a barrier: ML models can be globally present and do not have to be replicated to overcome hybrid cloud latency limitations.
Intuitive Visibility
Alkira provides intuitive and holistic visibility into the entire multi-cloud ecosystem. The portal presents a detailed view of network- and application-level statistics, a route visualization dashboard portraying the health of the control plane, and several policy-driven metrics. The solution also actively monitors network endpoints with synthetic probes; actionable alerts are sent so that preventive or corrective actions can be taken immediately when an anomaly or failure is detected. The architecture ensures that an ML model never suffers performance degradation and that data remains accessible at all times.
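As a conceptual illustration only (not Alkira’s implementation), the toy probe below shows the idea behind synthetic monitoring: periodically test an endpoint and raise an alert when availability or latency crosses a threshold. The URL and threshold are placeholders.

```python
# Toy synthetic probe: check an endpoint's health and latency, and alert when
# either degrades. Standard library only; the URL and threshold are placeholders.
import time
import urllib.request

ENDPOINT = "https://example.com/health"   # hypothetical model-serving endpoint
LATENCY_THRESHOLD_S = 0.5

def probe(url):
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            ok = resp.status == 200
    except Exception:
        ok = False
    return ok, time.monotonic() - start

# Probe a few times for illustration; a real probe would run continuously.
for _ in range(5):
    healthy, latency = probe(ENDPOINT)
    if not healthy or latency > LATENCY_THRESHOLD_S:
        print(f"ALERT: endpoint degraded (ok={healthy}, latency={latency:.2f}s)")
    time.sleep(60)
```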
Resource Sharing
Data is increasingly distributed and siloed; to get the most accurate results from an ML model, the data needs to converge. Alkira offers resource sharing, the ability to selectively share certain resources from one segment to another. An ML model can be trained and deployed in its own segment; data from different segments can then be shared into the ML segment, and the model can work on the whole dataset to produce a complete and accurate outcome.
Machine Learning Ops (MLOps)
MLOps is the combination of Machine Learning and DevOps, and uses CI/CD practices to deploy ML models in production environments. MLOps aims to automate not only code deployment but also the collection of new data, retraining of the model and analysis of the results. Alkira offers a robust IaC capability; integrating Alkira’s IaC into MLOps pipelines greatly simplifies new data collection and model retraining.
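As a hedged sketch of the retraining step such a pipeline might automate (assuming pandas, scikit-learn and joblib; the file paths are placeholders), the script below collects newly arrived data, retrains a model and reports the result before the model is promoted.

```python
# Minimal sketch of an MLOps retraining step. In a real pipeline, CI/CD would
# trigger this script whenever new data lands; paths here are placeholders.
import joblib
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

def retrain(data_path="data/new_batch.csv", model_path="model.joblib"):
    # 1. Collect the newly arrived data.
    df = pd.read_csv(data_path)
    X, y = df.drop(columns=["label"]), df["label"]
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # 2. Retrain the model on the fresh dataset.
    model = GradientBoostingClassifier().fit(X_train, y_train)

    # 3. Analyze the results before promoting the model to production.
    score = f1_score(y_test, model.predict(X_test))
    print(f"retrained model F1: {score:.3f}")
    joblib.dump(model, model_path)

if __name__ == "__main__":
    retrain()
```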
Conclusion
The AI/ML castle can and should be built in a hybrid cloud environment, and Alkira provides the best architecture for it. With Alkira, the castle’s foundation is strong, its walls are well fortified, and there is abundant sunshine and visibility into all of the castle’s quarters and facilities.
To learn more about Alkira’s solution, visit https://www.alkira.com/
Take your own tour of the Alkira solution at https://www.alkira.com/virtual-tour
To request your personalized demo, go to https://www.alkira.com/demo