Distributed Artificial Intelligence with InterSystems IRIS

What is Distributed Artificial Intelligence (DAI)?

Attempts to find a “bullet-proof” definition have not produced a result: the term seems to be slightly “ahead of its time”. Still, we can analyze the term itself semantically and conclude that distributed artificial intelligence is the same AI (see our effort to suggest an “applied” definition), only partitioned across several computers that are not clustered together (neither data-wise, nor via applications, nor by providing access to particular computers in principle). That is, ideally, distributed artificial intelligence should be arranged so that none of the computers participating in the “distribution” has direct access to the data or applications of another computer: the only remaining option is the transmission of data samples and executable scripts via “transparent” messaging. Any deviation from that ideal leads to “partially distributed artificial intelligence”, an example being distributed data with a central application server, or its inverse. One way or the other, we obtain a set of “federated” models (i.e., models each trained on their own data sources, or each trained by their own algorithms, or both at once).

Distributed AI scenarios “for the masses”

We will not be discussing edge computations, confidential data operators, scattered mobile searches, or similar scenarios that are fascinating but not yet widely and deliberately applied. We will be much “closer to life” if, for instance, we consider the following scenario (its detailed demo can and should be watched here): a company runs a production-level AI/ML solution whose quality is systematically checked by an external data scientist (i.e., an expert who is not an employee of the company). For a number of reasons, the company cannot grant the data scientist access to the solution, but it can send him a sample of records from a required table on a schedule or upon a particular event (for example, the completion of a training session for one or several models by the solution). We further assume that the data scientist owns some version of the AI/ML mechanisms already integrated in the production-level solution that the company is running, and it is likely that the data scientist himself develops, improves, and adapts them to the concrete use cases of that company. Deployment of those mechanisms into the running solution, monitoring of their functioning, and other lifecycle aspects are handled by a data engineer (a company employee).

An example of deploying a production-level AI/ML solution on the InterSystems IRIS platform that works autonomously with a flow of data coming from equipment was provided in this article. The same solution runs in the demo linked in the paragraph above. You can build your own solution prototype on InterSystems IRIS using the content (free with no time limit) in our repo Convergent Analytics (visit the sections Links to Required Downloads and Root Resources).

Which “degree of distribution” of AI do we get in such a scenario? In our opinion, we are rather close to the ideal here, because the data scientist is “cut off” from both the data (only a limited sample is transmitted, although a crucial one as of a point in time) and the algorithms of the company (the data scientist’s own “specimens” are never 100% in sync with the “live” mechanisms deployed and running as part of the real-time production-level solution), and he has no access at all to the company IT infrastructure. Therefore, the data scientist’s role comes down to a partial replay, on his local computational resources, of an episode of the company’s production-level AI/ML solution functioning, obtaining an estimate of the quality of that functioning at an acceptable confidence level, and returning feedback to the company (formulated, in our concrete scenario, as “audit” results plus, maybe, an improved version of one or another AI/ML mechanism involved in the company solution).
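To illustrate what an estimate of quality “at an acceptable confidence level” could look like on a transmitted sample, here is a minimal Python sketch. It assumes a classification model and uses accuracy with a Wilson score interval; the choice of metric and interval, as well as the function names `wilson_interval` and `audit_accuracy`, are our illustrative assumptions and are not taken from the demo.

```python
import math
from typing import Sequence, Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 is roughly 95% confidence)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


def audit_accuracy(y_true: Sequence, y_pred: Sequence, z: float = 1.96):
    """Return (accuracy, lower bound, upper bound) for predictions on a sample."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    hits = sum(t == p for t, p in zip(y_true, y_pred))
    low, high = wilson_interval(hits, len(y_true), z)
    return hits / len(y_true), low, high
```

The smaller the sample the company sends, the wider the interval becomes, which gives the auditor a simple way to judge whether the sample is large enough for a meaningful conclusion.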

Figure 1 Distributed AI scenario formulation

We know that feedback does not necessarily need to be formulated and transmitted by humans during an AI artifact exchange; this follows from publications about modern instruments and from existing experience with distributed AI implementations. However, the strength of the InterSystems IRIS platform is that it allows developing and launching both “hybrid” (a tandem of a human and a machine) and fully automated AI use cases equally efficiently, so we will continue our analysis based on the above “hybrid” example, leaving it to the reader to elaborate on its full automation.

How a concrete distributed AI scenario runs on InterSystems IRIS platform

The intro to our video with the scenario demo mentioned in the previous section gives a general overview of InterSystems IRIS as a real-time AI/ML platform and explains its support of DevOps macromechanisms. The demo does not explicitly cover the “company-side” business process that handles the regular transmission of training datasets to the external data scientist, so we will start with a short coverage of that business process and its steps.

A major “engine” of the sender business process is the while-loop (implemented using the InterSystems IRIS visual business process composer, which is based on the BPL notation interpreted by the platform), responsible for the systematic sending of training datasets to the external data scientist. The following actions are executed inside that “engine” (see the diagram; the data consistency actions can be skipped):

Figure 2 Main part of the “sender” business process

(a) Load Analyzer: loads the current set of records from the training dataset table into the business process and forms a dataframe from it in the Python session. The call-action triggers an SQL query to the InterSystems IRIS DBMS and a call to the Python interface that passes it the SQL result, so that the dataframe is formed;

(b) Analyzer 2 Azure: another call-action; it triggers a call to the Python interface that passes it a set of Azure ML SDK for Python instructions to build the required infrastructure in Azure and to deploy onto that infrastructure the dataframe data formed in the previous action.

As a result of the above business process actions, we obtain a stored object (a .csv file) in Azure containing an export of the most recent dataset used for model training by the production-level solution at the company:

Figure 3 “Arrival” of the training dataset to Azure ML
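The two actions described above (query the training table, then ship the result as a .csv object) can be sketched in plain Python as follows. This is only an illustration under stated assumptions: `sqlite3` stands in for the InterSystems IRIS DBMS, and `upload` is a hypothetical callback that takes the place of the Azure ML SDK calls, which we deliberately do not reproduce here. The function names are ours as well.

```python
import csv
import io
import sqlite3
from typing import Callable, List, Tuple


def load_training_rows(conn: sqlite3.Connection, table: str) -> Tuple[List[str], list]:
    """Return (column names, rows) for the current contents of the table."""
    # The table name is interpolated, so it must come from trusted configuration.
    cur = conn.execute(f'SELECT * FROM "{table}"')
    columns = [d[0] for d in cur.description]
    return columns, cur.fetchall()


def rows_to_csv(columns: List[str], rows: list) -> str:
    """Serialize a header and rows into CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(columns)
    writer.writerows(rows)
    return buf.getvalue()


def send_dataset(conn: sqlite3.Connection, table: str,
                 upload: Callable[[str, str], None]) -> int:
    """Export the table as CSV, hand it to `upload`, and return the row count."""
    columns, rows = load_training_rows(conn, table)
    upload(f"{table}.csv", rows_to_csv(columns, rows))
    return len(rows)
```

Keeping the upload behind a callback means the same export logic can be exercised locally without any cloud account, and the cloud-specific part stays in one replaceable place.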

With that, the main part of the sender business process is over, but one more action is needed, because any computational resources that we create in Azure ML are billable (see the diagram; the data consistency actions can be skipped):

Figure 4 Final part of the “sender” business process

(c) Resource Cleanup: triggers a call to the Python interface that passes it a set of Azure ML SDK for Python instructions to remove from Azure the computational infrastructure built in the previous action.
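The overall shape of the sender “engine”, including the cleanup of billable resources, can be sketched as below. This is not BPL and not the demo code: `run_sender` and all of its callable parameters (`load`, `provision`, `deploy`, `cleanup`) are hypothetical placeholders for the call-actions described above. The point of the sketch is the `try`/`finally` block, which releases the resource even when the deployment step fails.

```python
from typing import Callable, Iterable


def run_sender(
    triggers: Iterable[str],
    load: Callable[[], list],
    provision: Callable[[], str],
    deploy: Callable[[str, list], None],
    cleanup: Callable[[str], None],
) -> int:
    """Send one dataset export per trigger and return the number of sends.

    An exception raised by `deploy` still propagates to the caller,
    but only after `cleanup` has been called for the provisioned resource.
    """
    sent = 0
    pending = iter(triggers)
    # Each trigger stands for a schedule tick or an event such as
    # "model training finished"; the loop ends when no triggers remain.
    while (trigger := next(pending, None)) is not None:
        rows = load()               # (a) load the current training records
        resource = provision()      # (b) build the cloud-side infrastructure
        try:
            deploy(resource, rows)  # (b) place the dataset on it
            sent += 1
        finally:
            cleanup(resource)       # (c) always remove billable resources
    return sent
```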

The data required by the data scientist has been transmitted (the dataset is now in Azure), so we can proceed with launching the “external” business process that accesses the dataset, runs at least one alternative model training (algorithmically, an alternative model is distinct from the model running as part of the production-level solution), and returns to the data scientist the resulting model quality metrics plus visualizations that allow formulating “audit findings” about the efficiency of the company’s production-level solution.

Let us now take a look at the receiver business process. Unlike its sender counterpart (which runs among the other business processes comprising the autonomous AI/ML solution at the company), it does not require a while-loop. Instead, it contains a sequence of actions related to training alternative models in Azure ML and in IntegratedML (the accelerator for using auto-ML frameworks from within InterSystems IRIS) and to extracting the training results into InterSystems IRIS (the platform is assumed to be installed locally on the data scientist’s side as well):

Figure 5 “Receiver” business process

(a) Import Python Modules: triggers a call to the Python interface that passes it a set of instructions to import the Python modules required for further actions;

(b) Set AUDITOR Parameters: triggers a call to the Python interface that passes it a set of instructions to assign default values to the variables required for further actions;

(c) Audit with Azure ML: hands the “audit assignment” to Azure ML (from here on, we omit the mention of Python interface calls);

(d) Interpret Azure ML: brings the data transmitted to Azure ML by the sender business process into the local Python session, together with the “audit” results by Azure ML (it also creates a visualization of the “audit” results in the Python session);

(e) Stream to IRIS: extracts the data transmitted to Azure ML by the sender business process, together with the “audit” results by Azure ML, from the local Python session into a business process variable in IRIS;

(f) Populate IRIS: writes the data transmitted to Azure ML by the sender business process, together with the “audit” results by Azure ML, from the business process variable in IRIS to a table in IRIS;

(g) Audit with IntegratedML: “audits” the data received from Azure ML, together with the “audit” results by Azure ML, written into IRIS in the previous action, using the IntegratedML accelerator (in this particular case, it drives the H2O auto-ML framework);

(h) Query to Python: transfers the data and the “audit” results by IntegratedML into the Python session;

(i) Interpret IntegratedML: creates, in the Python session, a visualization of the “audit” results by IntegratedML;

(j) Resource Cleanup: deletes from Azure the computational infrastructure created in the previous actions.
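Unlike the sender, the receiver is a straight sequence of actions that pass their results to one another. A minimal Python sketch of that shape is shown below; `run_receiver`, the shared context dictionary, and the three example step functions are hypothetical illustrations and do not call Azure ML, IntegratedML, or IRIS. As in the list above, the cleanup step runs last, and here it runs even if an earlier step fails.

```python
from typing import Any, Callable, Dict, List, Tuple

Context = Dict[str, Any]
Step = Tuple[str, Callable[[Context], None]]


def run_receiver(steps: List[Step], cleanup: Callable[[Context], None]) -> Context:
    """Run the steps in order on a shared context; always call cleanup at the end."""
    ctx: Context = {"completed": []}
    try:
        for name, action in steps:
            action(ctx)
            ctx["completed"].append(name)
    finally:
        cleanup(ctx)
    return ctx


# Hypothetical stand-ins for some of the actions listed above.
def set_auditor_parameters(ctx: Context) -> None:
    ctx["params"] = {"target": "label"}  # illustrative default only


def audit_with_cloud(ctx: Context) -> None:
    # A real step would hand the "audit assignment" to a cloud ML service.
    ctx["cloud_audit"] = {"target": ctx["params"]["target"], "status": "done"}


def release_resources(ctx: Context) -> None:
    ctx["resources_released"] = True


result = run_receiver(
    [
        ("Set AUDITOR Parameters", set_auditor_parameters),
        ("Audit with Azure ML", audit_with_cloud),
    ],
    cleanup=release_resources,
)
print(result["completed"])
```

Because every step reads from and writes to the same context, later steps (such as interpretation or streaming into a table) can be added to the list without changing the runner.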

Figure 6 Visualization of Azure ML “audit” results

Figure 7 Visualization of IntegratedML “audit” results

How distributed AI is implemented in general on InterSystems IRIS platform

The InterSystems IRIS platform distinguishes three fundamental approaches to distributed AI implementation:

  • Direct exchange of AI artifacts, with their local and central handling based on rules and algorithms defined by the user
  • AI artifact handling delegated to specialized frameworks (for example, TensorFlow or PyTorch), with exchange orchestration and various preparatory steps configured by the user on the local and central instances of InterSystems IRIS
  • Both AI artifact exchange and handling done via cloud providers (Azure, AWS, GCP), with the local and central instances just sending input data to a cloud provider and receiving the end result back from it

Figure 8 Fundamental approaches to distributed AI implementation on InterSystems IRIS platform
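To make the first of the three approaches more tangible, here is a minimal Python sketch of a central instance combining artifacts received from local instances according to a user-defined rule. Everything in it is an assumption made for illustration: the artifact layout (a weight vector plus a sample count), the function names, and the default rule, a sample-weighted average similar in spirit to federated averaging. The platform itself does not prescribe this rule.

```python
from typing import Any, Callable, Dict, List, Sequence

Artifact = Dict[str, Any]  # assumed layout: {"weights": [...], "n_samples": int}


def weighted_average(artifacts: Sequence[Artifact]) -> List[float]:
    """Average weight vectors, weighting each by its local sample count."""
    if not artifacts:
        raise ValueError("at least one artifact is required")
    total = sum(a["n_samples"] for a in artifacts)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(artifacts[0]["weights"])
    if any(len(a["weights"]) != dim for a in artifacts):
        raise ValueError("all weight vectors must have the same length")
    return [
        sum(a["weights"][i] * a["n_samples"] for a in artifacts) / total
        for i in range(dim)
    ]


def central_round(
    artifacts: Sequence[Artifact],
    rule: Callable[[Sequence[Artifact]], List[float]] = weighted_average,
) -> Artifact:
    """Apply the user-defined rule on the central instance and return a new artifact."""
    return {
        "weights": rule(artifacts),
        "n_samples": sum(a["n_samples"] for a in artifacts),
    }
```

Passing the rule as a parameter reflects the “defined by the user” part of the approach: a different aggregation (for example, picking the best-scoring local model) can be substituted without touching the exchange logic.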

These fundamental approaches can be modified or combined: in particular, the concrete “audit” scenario described in the previous section uses the third, “cloud-centric”, approach, with the “auditor” part split into a cloud portion and a local portion executed on the data scientist’s side (acting as a “central instance”).

The theoretical and applied elements that currently add up to the “distributed artificial intelligence” discipline have not yet taken a “canonical form”, which creates a huge potential for implementation innovations. Our team of experts closely follows the evolution of distributed AI as a discipline and builds accelerators for its implementation on the InterSystems IRIS platform. We would be glad to share our content and to help everyone who finds the domain discussed here useful start prototyping distributed AI mechanisms.