AI Act is coming. Is your organisation ready?

EU regulation called AI Act is fast approaching and will regulate how artificial intelligence is used in the EU. This blog post looks at what each organisation should be doing to be prepared in time for this.

August 29, 2023

Satu Korhonen ML engineer, who is focused in creating safe, secure, and sustainable ML solutions that bring long-term business value to organisations.

The EU regulation on ethical and robust AI solutions called the AI Act is fast approaching. The idea behind this regulation is to enable the development of artificial intelligence while ensuring that human rights are considered and accounted for in the development. The current estimated timeline is to reach an agreement about the content of the AI Act with all member states by the end of this year and have it in effect soon after. The broad strokes of this regulation are already known. My advice for all organisations is to start preparing for this now.

Two ways to prepare for the AI Act

The first thing is to become educated on what this regulation is and make sure everyone in your organisation creating, developing, and deciding on AI projects is also aware of it. This relates to the reasons behind it, the broad strokes of the regulation itself, and how to prepare for it.

The second aspect focuses on preparing the AI solutions in the development pipeline and in production to comply with the regulation as well as also looking at the development culture in the organisation so new AI solutions can be compliant by design. We at Solita have developed a client-tested tool and framework to help with this.

I will also look more closely at both topics next.

Get educated: how we got here and where’s here?

News of companies running into ethical and legal problems with artificial intelligence has made headlines for years with increasing numbers. In 2016 The COMPAS risk assessment recidivism algorithm used by the U.S. court system was assessed to be very biased. Microsoft unveiled the Tay chatbot only to take it offline within 16 hours due to inflammatory and offensive tweets. In 2018 Amazon shut down their recruitment AI for sexist results and in 2019 Apple credit card ran into problems when it offered smaller lines of credit to women than to men.

These and similar issues as well as the rise in popularity of this technology quickly put this topic on the tables of regulators to regulate AI in a way to make it align with legal, ethical, and safety guidelines in the European Union.

The rise in popularity of generative AI, with f.ex. Midjourney and ChatGPT, has required discussions on the content and wording of the AI Act late in this process.

In June 2018 the European Commission established an expert group to create Ethics Guidelines for Trustworthy Artificial Intelligence. The group emphasized three key components for the development of AI. Firstly, AI solutions should be lawful, adhering to existing laws and regulations related to, for instance, product development, data usage, and discrimination. Secondly, AI should be ethical, respecting principles of human autonomy, prevention of harm, fairness, and explicability. Lastly, AI systems should be technically robust, safe, and secure, and consider their social environment. Particular attention should be given to vulnerable and historically disadvantaged groups and situations characterized by power or information imbalances.

In April 2021 the European Commission proposed the first EU regulatory framework for AI. Their priority was to make sure that AI systems used in the EU are safe, fair, transparent, traceable, non-discriminatory, overseen by people, and environmentally friendly. This proposed framework is the basis of the AI Act. Since June 2023, the regulation has transitioned to the phase where EU countries are forming the final wording of the regulation.

General principles of this regulation is that all AI systems are safe, protect privacy, transparent, and non-discriminate, as well as having proper data management.

The plan is to reach an agreement by the end of this year with Generative AI causing plenty of discussions this spring.

The regulation takes a use case-based approach categorizing AI applications into four risk levels, which are low, limited, high and unaccepted risk. Low-risk applications, such as email spam filters and predictive maintenance systems, are not heavily regulated but still encouraged to be robust. Limited risk applications, including chatbots, are subject to transparency obligations. Users must be aware that they are interacting with AI and should have the option to discontinue use while at the same time receiving the service.

Generative AI can be at different risk levels. If used in low-risk application, it needs to be transparent in disclosing AI-generated content for users, prevent the generation of illegal content, and publish summaries of copyrighted data used in training. Currently, the larger foundational models still somewhat lack these aspects as Stanford University discovered earlier this year. The companies providing general-purpose foundational models AIs like GPT have requirements stated if they are to operate within EU.

High-risk applications are such that the system poses a significant risk of harm to the health, safety, or fundamental rights of natural persons or the environment and are subjected to strict obligations throughout their lifecycle. These use cases are defined as AI systems in law enforcement, migration and border control, recommender systems of large online platforms, justice administration and democratic processes, education, employment, essential private and public services, and critical infrastructure. These systems require comprehensive risk assessments and mitigation, data governance practices, transparency and provision of information about it being an AI system. They also require a fundamental rights impact assessment and monitoring environmental impact.

The unaccepted risk and hence prohibited use cases pose clear threats to people’s safety, livelihoods, and rights. Examples include biometric classification of individuals based on sensitive characteristics, real-time remote biometric identification and retrospective identification from recordings, proactive policing activities, emotion recognition systems in f.ex. law enforcement and education, and establishing large-scale facial recognition databases. These will not be allowed within the EU.

Get prepared: finding and fixing potential issues

While education on the regulation is still happening, the review of AI use cases in development and production should take place. This involves all solutions that are being designed, are in development, or are in production. For each, an assessment should be done on their risk level according to the regulation and the priority for business of each solution. This is a fairly simple operation of having current the AI Act guidelines and looking through each use case to which category they are most likely to land in. Legal advice can be sought to offer guidance in cases that could be seen one way or the other. For solutions in production that are already GDPR compliant and not high-risk, the chances are that required chances are small.

After each solution has been classified to a risk level, the characteristics of each solution need to be investigated regarding the AI Act risk assessment classification. We have developed a customer-tested framework and tool for this at Solita. The list of things to look at is very different in a low-risk application to that of a high-risk application. I recommend starting with the business-critical applications first. The idea here is to foresee possible issues with complying with the regulation and react to them in descending order of priority.

One thing each organisation should consider. If you are using an AI system offered from outside the EU, but used inside the EU, it’s advisable to prepare for some interruption of service. Systems offered outside the EU might not all choose to comply with the regulation but will instead opt out and not offer their services inside the EU, or they might take their service offline while building compliance with the new regulation. The more critical the AI system is to the company, the more important it is to take this into account in time.

Review the development and maintenance pipeline

While reviewing each use case is important, it is important also to start building or strengthening processes that ensure compliance with the regulation now and in the future so AI solutions can be compliant by design. Our framework serves this process well. The tool allows a company to look at their AI development processes, as well as each model, to easily document the current ways of working, identify any issues needing further development, and a way to document progress against identified gaps. We are closely following the development of the regulation and developing the framework accordingly.

Build a roadmap and implement it

Once use cases have been identified and the ways of working assessed, the next phase is to develop a roadmap in order of priority to fix any identified issues before they cause problems. Our framework helps with this. In limited to no-risk use cases it is useful to already start building transparency of an AI model being used and start building opt-out methods if not already present. Also, and especially in high-risk use cases, building a method of redress and obtaining explanations of the decisions made by the AI system, is very important.

For any use case that can be considered high risk, a more thorough investigation should be already started that documents carefully the system and its development and provides users and affected parties with the necessary information on the solution. The Solita framework makes this process easier. For the forbidden use cases, if they exist, it is necessary to think about the end of life of these solutions.

Act now

The AI Act is coming, and it is coming soon. I recommend starting to prepare for it now, and, if necessary, seek out ML engineers, responsible AI experts, and designers with thorough knowledge and understanding of this topic area to help to educate staff and build a roadmap for compliance with each AI solution in the development pipeline or in production. When the regulation comes into effect, I suspect many organizations will wake up to the need to do something as they did when GDPR came into effect and consultancies capable of helping them will be fully booked quickly. As the main parts of this regulation are already well known, my advice is to start preparing for this now. We’re always happy to help with client-tested tools to do so.

Oh, by the way, the Data Act, AI liability act, and Cyber Resiliency Act are coming too. Luckily, we’re looking at those too when we build our tools and frameworks.

MLOps on Databricks: Streamlining Machine Learning Workflows

Databricks is a cloud-based platform that seamlessly integrates data engineering, machine learning, and analytics to simplify the process of building, training, and deploying Machine Learning models. With its unified platform built on top of Lakehouse architecture, Databricks empowers Data Scientist and ML engineers to unleash their full potential, providing a collaborative workspace and offering comprehensive tooling that streamline the entire ML process, including tools to support DevOps to model development, deployment and management.

February 14, 2023

Jyoti Prasad Bartaula

MLOps in a nutshell

While many companies and businesses are investing in AI and machine learning to stay competitive and capture the untapped business opportunity, they are not reaping the benefits of those investments as their journey of operationalizing machine learning is stuck as a jupyter notebook level data science project. And that’s where MLOps comes to the rescue.

MLOps is a set of tools and practices for the development of machine learning systems. It aims to enhance the reliability, efficiency, and speed of productionizing machine learning. In the meantime, adhering to governance requirements. MLOps facilitate collaboration among data scientists, ML engineers, and other stakeholders and automate processes for a quicker production cycle of machine learning models. MLOps takes a few pages out of DevOps book; a methodology of modern software development but differs in asset management, as it involves managing source code, data, and machine learning models together for version control and model comparison, as well as for model reproducibility. Therefore, in essence, MLOps involves jointly managing source code (DevOps), data (DataOps) and Machine Learning models (ModelOps), while also continuously monitoring both the software system and the machine learning models to detect performance degradation.

MLOps = DevOps + DataOps + ModelOps

MLOps on Databricks

Recently, I had a chance to test and try out the Databricks platform. And in this blog post, I will attempt to summarise what Databricks has to offer in terms of MLOps capability.

First of all, what is Databricks ?

Databricks is a web based multi-cloud platform that aims to unify data engineering, machine learning, and analytics solutions under single service. The standalone aspect of Databricks is its LakeHouse architecture that provides data warehousing capabilities to a data lake. As a result, Databricks lakehouse eliminates the data silos due to pushing data into multiple data warehouses or data lakes, thereby providing data teams the single source of data.

Databricks aims to consolidate, streamline and standardise the productionizing machine learning with Databricks Machine Learning service. With MLOps approach built on their Lakehouse architecture, Databricks provides suits of tools to manage the entire ML lifecycle, from data preparation to model deployment.

MLOps approach on Databricks is built on their Lakehouse Platform which involves jointly managing code, data, and models. Fig:Databricks

For the DevOps part of MLOps, Databricks provides capability to integrate various git providers, DataOps uses DeltaLake and for ModelOps they come integrated with MLflow: an open-source machine learning model life cycle management platform.

DevOps

Databricks provides Repos that support git integration from various git providers like Github, Bitbucket, Azure DevOps, AWS CodeCommit and Gitlab and their associated CI/CD tools. Databricks repos also support various git operations such as cloning a repository, committing. and pushing, pulling, branch management, and visual comparison of diffs when committing, helping to sync notebooks and source code with Databricks workspaces.

DataOps

DataOps is built on top of Delta Lake. Databricks manages all types of data (raw data, log, features, prediction, monitoring data etc) related to the ML system with Delta Lake. As the feature table can be written as a Delta table on top of delta lake, every data we write to delta lake is automatically versioned. And as Delta Lake is equipped with time travel capability, we can access any historical version of the data with a version number or a timestamp.

In addition, Databricks also provides this nice feature called Feature Store. Feature Store is a centralised repository for storing, sharing, and discovering features across the team. There are a number of benefits of adding feature stores in machine learning learning development cycle. First, having a centralised feature store brings the consistency in terms of feature input between model training and inference eliminating online/offline skew there by increasing the model accuracy in production. It also eliminates the separate feature engineering pipeline for training and inference reducing the technical dept of the team. As the feature store integrates with other services in Databricks, features are reusable and discoverable to other teams as well; like analytics and BI teams can use the same set of features without needing to recreate them. Databricks’s Feature store also allows for versioning and lineage tracking of features like who created features, what services/models are using them etc thereby making it easier to apply any governance like access control list over them.

ModelOps

ModelOps capability in Databricks is built on a popular open-source framework called MLFlow. MLflow provides various components and apis to track and log machine learning experiments and manage model’s lifecycle stage transition.

Two of the main components of MLFlow are MLFlow tracking and MLFlow model registry.

The MLflow tracking component provides an api to log and query and an intuitive UI to view parameters, metrics, tags, source code version and artefacts related to machine learning experiments where experiment is aggregation of runs and runs are executions of code. This capability to track and query experiments helps in understanding how different models perform and how their performance depends on the input data, hyperparameter etc.

Another core component of MLflow is Model Registry: a collaborative model hub, which let’s manage MLflow models and their lifecycle centrally. Model registry is designed to take a model from model tracking to put it through staging and into production. Model registry manages model versioning, model staging (assign “Staging” and “Production” to represent the lifecycle of a model version), model lineage (which MLflow Experiment and Run produced the model) and model annotation (e.g. tags and comments). Model registry provides webhooks and api to integrate with continuous delivery systems.

The MLflow Model Registry enables versioning of a single corresponding registered model where we can seamlessly perform stage transitions of those versioned models.

Databricks also supports the deployment of Model Registry’s production model in multiple modes: batch and streaming jobs or as a low latency REST API, making it easy to meet the specific requirements of an organisation.

For model monitoring, Databricks allows logging the input queries and predictions of any deployed model to Delta tables.

Conclusion

MLOps is a relatively nascent field and there are a myriad of tools and MLOps platforms out there to choose from. Apples to apples comparison of those platforms might be difficult as the best MLOps tool for one case might differ to another case. After all, choosing the fitting MLOps tools highly depends on various factors like business need, current setup, available resources at disposal etc.

However, with the experience of using a few other platforms, personally, I find Databricks the most comprehensive platform of all. I believe Databricks make it easy for organisations to streamline their ML operations at scale. Platform’s collaboration and sharing capabilities should make it easy for teams to work together on data projects using multiple technologies in parallel. One particular tool which I found pleasing to work with is Databricks notebook. It is a code development tool, which supports multiple programming languages (R, SQL, Python, Scala ) in a single notebook, while also supporting real time co-editing and commenting. In addition, as the entire project can be version controlled by a tool of choice and integrates very well with their associated CI/CD tools, it adds flexibility to manage, automate and execute the different pipelines.

To sum up, Databricks strength lies in its collaborative, comprehensive and integrated environment for running any kind of data loads whether it is data engineering, data science or machine learning on top of their Lakehouse architecture. While many cloud based tools come tightly coupled with their cloud services, Databricks is cloud agnostic making it easy to set up if one’s enterprise is already running on a major cloud provider (AWS, Azure or Google cloud).

Finally, if you would like to hear more about Databricks as an unified Analytics, Data, and Machine Learning platform and learn how to leverage Databricks services in your Data journey, please don’t hesitate to contact me our Business Lead – Data Science, AI & Analytics, Mikael Ruohonen at +358414516808 or mikael.ruohonen@solita.fi or me at jyotiprasad.bartaula@solita.fi.

Introduction to Edge AI with HPE Ezmeral Data Fabric

In this blog, we will be talking about how technology has shifted from on-premises data centers to the cloud and from cloud to edge. Then, we will explain data fabric, introduce HPE Ezmeral Data Fabric and investigate its capabilities. Finally, we will talk about Edge AI with HPE Ezmeral Data Fabric.

September 15, 2022

Sadaf Nazari

To see what Edge AI is, we need to take a deeper look at the history of data processing over time.

The evolutions of data-intensive workloads

On-premises data centers

Back in 2000, almost everything was running locally in on-premises data centers. This means that everything from management to maintenance was on the company’s shoulders. It was fine but over time, when everything was getting more dependent on the internet, businesses faced some challenges. Here are some of the most important ones:

Infrastructure inflexibility

Over time, many new services and technologies are released and it should be taken into consideration that there might be a need to update the infrastructure or apply some changes to the services.

This can be challenging when it comes to hardware changes. The only solution seems to be purchasing the desirable hardware, then manual configuration. It can be worse if, at some point, we realize that the new changes are not beneficial. In this case, we have to start all over again!

This inflexibility causes wasting money and energy.

How about scaling on demand

A good business invests a lot of money to satisfy its customers. It can be seen from different angles but one of the most important ones always has the capacity to respond to the clients as soon as possible. This rule is also applied to the digital world: even loyal customers might change their minds if they see that the servers are not responding due to reaching their maximum capacity.

Therefore, there should be an estimation of the demand. The challenging part of this estimation is when this demand goes very high on some days during the year and one should forecast it. This demand forecasting has many aspects and it is not limited to the digital traffic from clients to servers. Having a good estimation of the demand for a particular item in the inventory is highly valuable.

Black Friday is a good example of such a situation.

There are two ways to cope with this unusual high demand:

Purchase extra hardware to ensure that there will be no delay in responding to the customers’ requests. This strategy seems to be safe, but it has some disadvantages. First, since the demand is high on only certain days, many resources are in idle mode for a long time. Second, the manual configuration of the newly purchased devices should be considered. All in all, it is not a wise decision financially.
Ignore that demand and let customer experience the downtime and wait for servers to become available. As it is easy to guess, it is not good for the reputation of the business.

This inflexibility is hard to address, and it gets worst over time.

Expansion

One might want to expand the business geographically. Along with marketing, there are some technical challenges.

The issue with the geographical expansion is the delay that is caused by the physical distance between the clients and servers. A good strategy is to distribute the data centers around the world and locate them somewhere closer to the customers.

The configuration of these new data centers along with the security, networking, and data management might be very hard.

Cloud Computing

Having the challenges of the on-premises data centers, the first evolution of data-intensive workloads happened around 2010 when third-party cloud providers such as Amazon Web Services and Microsoft Azure were introduced.

They provided companies with the infrastructure/services with the pay-as-you-go approach.

Cloud Computing solved many problems with on-premises approaches.

Risto and Timo have a great blog post about “Cloud Data Transformation” and I recommend checking it out to know more about the advantages of Cloud Computing.

Edge Computing

Over time, more applications have been developed, and Cloud Computing seemed to be the proper solution for them, but around 2020 Edge Computing got more and more attention as the solution for a group of newly-introduced applications that were more challenging.

The common feature of these applications was being time-sensitive. Cloud computing might act poorly in such cases since the data transmission to the cloud is time-consuming itself.

The basic idea of Edge Computing is to process data close to where it is produced. This decentralization has some benefits such as:

Reducing latency

As discussed earlier, the main advantage of Edge Computing is that it reduces the latency by eliminating the data transmission between its source and cloud.

Saving Network Bandwidth

Since the data is being processed in Edge Nodes, the network bandwidth can be saved. This matters a lot when the stream of data needs to be processed.

Privacy-preserving

Another essential advantage of Edge Computing is that the data does not need to leave its source. Therefore, it can be used in some applications where sending data to the cloud/on-perm data centers is not aligned with regulations.

AI applications

Many real-world use cases in the industry were introduced along with the advances in Artificial Intelligence.

There are two options for deploying the models: Cloud-based AI and Edge AI. There is also another categorization for training the model (centralized and decentralized) but it is beyond the scope of this blog.

Cloud-based AI

With this approach, everything happens in the cloud, from data gathering to training and deploying the model.

Cloud-based AI has many advantages, such as being cost-saving. It would be much cheaper to use cloud infrastructure for training a model rather than purchasing the physical GPU-enabled computers.

The workflow of such an application is that after the model is deployed, new unseen data from the business unit (or wherever the source of data is) will be sent to the cloud, the decision will be made there and it will be sent back to the business unit.

Edge AI

As you might have guessed, Edge AI addresses the time-sensitivity issue. This time, the data gathering and training of the model steps still happen in the cloud, but the model will be deployed on the edge nodes. This change in the workflow not only saves the network bandwidth but also reduces the latency.

Edge AI opens the doors to many real-time AI-driven applications in the industry. Here are some examples:

Autonomous Vehicles
Traffic Management Systems
Healthcare systems
Digital Twins

Data Fabric

So far, we have discussed a bit about the concepts of Cloud/Edge computing, but as always, the story is different in real-world applications.

We talked about the benefits of cloud computing but it is important to ask these questions ourselves:

What would be the architecture of having such services in the Cloud/Edge?
What is the process of migration from on-prem to cloud? What are the challenges? How can we solve them?
How can we manage and access data in a unified manner to avoid data silos?
How can we orchestrate distributed servers or edge nodes in an optimized and secure way?
How about monitoring and visualization?

Many companies came up with their own solutions for the above questions with manual work but there is a need for a better way for a business to focus on creating values, rather than dealing with these issues. This is when Data Fabric comes into the game.

Data Fabric is an approach for managing data in an organization. Its architecture consists of a set of services that make accessing data easier regardless of its location (on-prem, cloud, edge). This architecture is flexible, secure, and adaptive.

Data Fabric can reduce the integration time, the maintenance time, and the deployment time.

Next, we will be talking about the HPE Ezmeral Data Fabric (Data Fabric is offered as a solution by many enterprises and the comparison between them is beyond the scope of this blog).

HPE Ezmeral Data Fabric

HPE Ezmeral Data Fabric is an Edge to Cloud solution that supports industry-standard APIs such as REST, S3, POSIX, and NFS. It also has an ecosystem package that contains many open-source tools such as Apache Spark and allows you to do data analysis.

You can find more information about the benefits of using HPE Ezmeral Data Fabric here.

As you can see, there is an eye-catching part named “Data Fabric Event Stream”. This is the key feature that allows us to develop Edge AI applications with the HPE Ezmeral Data Fabric.

Edge AI with HPE Ezmeral Data Fabric – application

An Edge AI application should contain at least one platform for orchestrating the broker cluster such as Kafka, some tools such as Apache Spark, and a data store. This might not be as easy as it seems, especially in large-scale applications when we have millions of sensors, thousands of edge sites, and the cloud.

Fortunately, with HPE Ezmeral Data Fabric Event Stream, this task can be done much easier. We will go through it by demonstrating a simple application that we developed.

Once you set up the cluster, the only thing you need to do is to install the client on the edge nodes, connect them to the cluster (by a simple line maprlogin command), and then enable the services that you want to use.

For the event stream, it is already there, and again it just needs a single command for creating a stream and then creating topics in it.

For the publisher (also called producer), you need to just send the data from any source to the broker, and for the subscriber (also called consumer) the story is the same.

For using open-source tools such as Apache Spark (or in our case Spark Structure Streaming), you just need to install them on the mapr client, and the connection between the client and the cluster will be automatically established. So you can run a script in edge nodes and access data in the cluster.

Storing data is again as simple as the previous ones. The table creation can be done with a single command, and storing it is also straightforward.

Conclusion

To sum up, Edge AI has a promising future, and leveraging it with different tools such as Data Fabric can be a game changer.

Thank you for reading this blog! I would also like to invite you to our talk about the benefits of Edge Computing in Pori on 23/09/2022!

More information can be found here.

Sadaf Nazari.

Your AI partner can make or break you!

Industries have resorted to use AI partner services to fuel their AI aspirations and quickly bring their product and services to market. Choosing the right partner is challenging and this blog lists a few pointers that industries can utilize in their decision making process.

May 11, 2022

Karthik Sindhya

Large investments in AI clearly indicate industries have embraced the value of AI. Such a high AI adoption rate has induced a severe lack of talented data scientists, data engineers and machine learning engineers. Moreover, with the availability of alternative options, high paying jobs and numerous benefits, it is clearly an employee’s market.

Market has a plethora of AI consulting companies ready to fill in the role of AI partners with leading industries. Among such companies, on one end are the traditional IT services companies, who have evolved to provide AI services and on the other end are the AI start-up companies who have backgrounds from academia with a research focus striving to deliver the top specialists to industries.

Considering that a company is willing to venture into AI with an AI partner. In this blog I shall enumerate what are the essentials that one can look for before deciding to pick their preferred AI partner.

AI knowledge and experience: AI is evolving fast with new technologies developed by both industries and academia. Use cases in AI also span multiple areas within a single company. Most cases usually fall in following domains: Computer vision, Computer audition, Natural language processing, Interpersonally intelligent machines, routing, and motion and robotics. It is natural to look for AI partners with specialists in the above areas.

It is worth remembering that most AI use cases do not require AI specialists or super specialists and generalists with wide AI experience could well handle the cases.

Also specialising in AI alone does not suffice to successfully bring the case to production. The art of handling industrial AI use cases is not trivial and novice AI specialists and those that are freshly out of University need oversight. Hence companies have to be careful with such AI specialists with only academic experience or little industrial experience.

Domain experience: Many AI techniques are applicable across cases in multiple domains. Hence it is not always necessary to seek such consultants with domain expertise and often it is an overkill with additional expert costs. Additionally, too much domain knowledge can also restrict our thinking in some ways. However, there are exceptions when domain knowledge might be helpful, especially when limited data are available.

A domain agnostic AI consultant can create and deliver AI models in multiple domains in collaboration with company domain experts.

Thus making them available for such projects would be important for the company.

Problem solving approach This is probably the most important attribute when evaluating an AI partner. Company cases can be categorised in one of the following silo’s:

Open sea: Though uncommon, it is possible to see few such scenarios, when the companies are at an early stage of their AI strategy. This is attractive for many AI consultants who have the freedom to carve out an AI strategy and succeeding steps to boost the AI capabilities for their clients. With such freedom comes great responsibility and AI partners for such scenarios must be carefully chosen who have a long standing position within the industry as a trusted partner.
Straits: This is most common when the use case is at least coarsely defined and suitable ML technologies are to be chosen and take the AI use case to production. Such cases often don’t need high grade AI researchers/scientists but any generalist data scientist and engineer with the experience of working in an agile way can be a perfect match.
Stormy seas: This is possibly the hardest case, where the use case is not clearly defined and also no ready solution is available. The use case definition is easy to be defined with data and AI strategists, but research and development of new technologies requires AI specialists/scientists. Hence special emphasis should be focused on checking the presence of such specialists. It is worth noting that AI specialists availability alone does not guarantee that there is a guaranteed conversion to production.

Data security: Data is the fuel for growth for many companies. It is quite natural that companies are extremely careful with safeguarding the data and their use. Thus when choosing an AI partner it is important to look and ask for data security measures that are currently practised with the AI partner candidate organisation. In my experience it is quite common that AI specialists do not have data security training. If the company does not emphasise on ethics and security the data is most likely stored by partners all over the internet, (i.e. personal dropbox and onedrive accounts) including their private laptops.

Data platform skills: AI technologies are usually built on data. It is quite common that companies have multiple databases and do not have a clear data strategy. AI partners with inbuilt experience in data engineering shall go well, else a separate partner would be needed.

Design thinking: Design thinking is rarely considered a priority expertise when it comes to AI partnering and development. However this is probably the hidden gem beyond every successful deployment of AI use case. AI design thinking adopts a human centric approach, where the user is at the centre of the entire development process and her/his wishes are the most important. The adoption of the AI products would significantly increase when the users problems are accounted for, including AI ethics.

Blowed marketing: Usually AI partner marketing slides boast about successful AI projects. Companies must be careful interpreting this number, as often major portions of these projects are just proof of concepts which have not seen the light of day for various reasons. Companies should ask for the percentage of those projects that have entered into production or at least entered a minimum viable product stage.

Above we highlight a few points that one must look for in an AI partner, however what is important over all the above is the market perception of the candidate partner, and as a buyer you believe there is a culture fit, they understand your values, terms of cooperation, and their ability to co-define the value proposition of the AI case. Also AI consultants should stand up for their choices and not shy away from pointing to the infeasibility and lack of technologies/data to achieve desired goals set for AI use cases fearing the collapse of their sales.

Finding the right partner is not that difficult, if you wish to understand Solita’s position on the above pointers and looking for an AI partner don’t hesitate to contact us.

Author: Karthik Sindhya, PhD, AI strategist, Data Science, AI & Analytics,
Tel. +358 40 5020418, karthik.sindhya@solita.fi

How to choose your next machine learning project

Three steps to be intentionally agnostic about tools. Reduce technical debt, increase stakeholder trust and make the objective clear. Build a machine learning system because it adds value, not because it is a hammer to problems.

March 9, 2022

Philipp Schmalen

As data enthusiasts we love to talk, read and hear about machine learning. It certainly delivers value to some businesses. However, it is worth taking a step back. Do we treat machine learning as a hammer to problems? Maybe a simple heuristic does the job with substantially lower technical debt than a machine learning system.

Do machine learning like the great engineer you are, not like the great machine learning expert you aren’t.

Google developers. Rules of ML.

In this article, I look at a structured approach to choose the next data science project that aligns to business goals. It combines objective key results (OKR), value-feasibility and other suggestions to stay focused. It is especially useful for data science leads, business intelligence leads or data consultants.

Why data science projects require a structured approach

ML solves complex problems with data that has a predictive signal for the problem at hand. It does not create value by itself.

So, we love to talk about Machine learning (ML) and artificial intelligence (AI). On the one hand, decision makers get excited and make it a goal: “We need to have AI & ML”. On the other hand, the same goes for data scientists who claim: “We need to use a state-of-the-art method”. Being excited about technology has its upsides, but it is worth taking a step back for two reasons.

Choosing a complex solution without defining a goal creates more issues than it solves. Keep it simple, minimize technical debt. Make it easy for a future person to maintain it, because that person might be you.
A method without a clear goal fails to create business value and erodes trust. Beyond the hype around machine learning, we do data science to create business value. Ignoring this lets executives reduce funding for the next data project.

This is nothing new. But, it does not hurt to be reminded of it. If I read about an exciting method, I want to learn and apply it right away. What is great for personal development, might not be great for the business. Instead, start with what before thinking about how.

In the next section, I give some practical advice on how to structure the journey towards your next data project. The approach helps me to focus on what is next up for the business to solve instead of what ML method is in the news.

How to choose the next data science project

“Rule #1: Don’t be afraid to launch a product without machine learning.”

Google developers. Rules of ML.

Imagine you draft the next data science cases at your company. What project to choose next? Here are three steps to structure the journey.

Step 1: Write data science project cards

The data science project card helps to focus on business value and lets you be intentionally agnostic about methodologies in the early stage

Summarize each idea in a data science project card which includes some kind of OKR, data requirements, value-feasibility and possible extensions. It covers five parts which contain all you need to structure project ideas, namely an objective (what), its key results (how), ideal and available data (needs), the value-feasibility diagram (impact) and possible extension. What works for me is to imagine the end-product/solution to a business need/problem before I put it into a project card.

Find the project card templates as markdown or powerpoint slides.

I summarize the data science project in five parts.

An objective addresses a specific problem that links to a strategic goal/mission/vision, for example: “Enable data-driven marketing to get ahead of competitors”, “Automate fraud detection for affiliate programs to make marketing focusing on core tasks” or “Build automated monthly demand forecast to safeguard company expansion”.
Key results list measurable outcomes that mark progress towards achieving the objective, for example: “80% of marketing team use a dashboard daily”, “Cover 75% of affiliate fraud compared to previous 3 month average” or “Cut ‘out-of-stock’ warnings by 50%, compared to previous year average”.
Data describes properties of the ideal or available dataset, for example: “Transaction-level data of the last 2 years with details, such as timestamp, ip and user agent” or “Product-level sales including metadata, such as location, store details, receipt id or customer id”.
Extensions explores follow-up projects, for example: “Apply demand forecast to other product categories” or “Take insights from basket analysis to inform procurement.”
The value-feasibility diagram puts the project into a business perspective by visualizing value, feasibility and uncertainties around it. The smaller the area, the more certain is the project’s value or feasibility.

To provide details, I describe a practical example how I use these parts for exploring data science projects. The journey starts by meeting the marketing team to hear about their work, needs and challenges. If a need can be addressed with data, they become the end-users and project target group. Already here, I try to sketch the outcome and ask the team about how valuable it is which estimates the value.

Next, I take the company’s strategic goals and formulate an objective that links to them following OKR principles. This aligns the project with mid-term business goals, makes it part of the strategy and increases buy-in from top-level managers. Then I get back to the marketing team to define key results that let us reach the objective.

A draft of an ideal dataset gets compared to what is available with data owners or the marketing team itself. That helps to get a sense for feasibility. If I am uncertain about value and feasibility, I increase the area in the diagram. It is less about being precise, but about being able to compare projects with each other.

Step 2: Sort projects along value and feasibility

Value-feasibility helps to prioritize projects, takes a business perspective and increases stakeholder buy-in.

Ranking each project along value and feasibility makes it easier to see which one to prioritize. The areas visualize uncertainties on value and feasibility. The larger they stretch along an axis, the less certain I am about either value or feasibility. If they are more dot-shaped, I am confident about a project’s value and its feasibility.

Projects with their estimated value and feasibility

Note that some frameworks evaluate adaptation and desirability separately to value and feasibility. But you get low value when you score low on either adaptation or desirability. So, I estimate the value with business value, adaptation and desirability in my mind without explicitly mentioning it.

Data science projects tend to be long-term with low feasibility today and uncertain, but potentially high future value. Breaking down visionary, less feasible projects into parts that add value in themselves could produce a data science roadmap. For example, project C which has uncertain value and not feasible as of today, requires project B to be completed. Still, the valuable and feasible project A should be prioritized now. Thereafter, aim for B on your way to C. Overall, this overview helps to link projects and build a mid-term data science roadmap.

Related data science projects combined to a roadmap

Here is an example of a roadmap that starts with descriptive data science cases and progresses towards more advanced analytics such as forecasting. That gives a prioritization and helps to draft a budget.

Step 3: Iterate around the objective, method, data and value-feasibility

Be intentionally agnostic about the method first, then opt for the simplest one, check the data and implement. Fail fast, log rigorously and aim for the key results.

Implementing data science projects has so many degrees of freedom that it is beyond the scope of this article to provide an exhaustive review. Nevertheless, I collected some statements that can help through the project.

Don’t be afraid to launch a product without machine learning. And do machine learning like the great engineer you are, not like the great machine learning expert you aren’t. (Google developers. Rules of ML.)
Focus on few customers with general properties instead of specific use cases (Zhenzhong Xu, 2022. The four innovation phases of Netflix’ trillions scale real-time data infrastructure.)
Keep the first model simple and get the infrastructure right. Any heuristic or model that gives quick feedback suits at early project stages. For example, start with linear regression or a heuristic that predicts the majority class for imbalanced datasets. Build and test the infrastructure around those components and replace them when the surrounding pipelines work (Google developers. Rules of ML. Mark Tenenholtz, 2022. 6 steps to train a model.)
Hold the model fixed and iteratively improve the data. Embrace a data-centric view where data consistency is paramount. This means, reduce the noise in your labels and features such that an existing predictive signal gets carved out for any model (Andrew Ng, 2021. MLOps: From model-centric to data-centric AI).
Each added component also adds a potential for failure. Therefore, expect failures and log any moving part in your system.
Test your evaluation metric and ensure you understand what “good” looks like (Raschka, 2020. Model Evaluation, Model Selection, and Algorithm Selection in Machine Learning.)

There are many more best practices to follow and they might work differently for each of us. I am curious to hear yours!

Conclusion

In this article, I outlined a structured approach for data science projects. It helps me to channel efforts into projects that fit business goals and choose appropriate methods. Applying complex methods like machine learning independent of business goals risks accruing technical debt and at worst jeopardizes investments.

I propose three steps to take action:

Write a project card that summarizes the objective of a data science case and employs goal-setting tools like OKR to engage business-oriented stakeholders.
Sort projects along value and feasibility to reasonably prioritize.
Iterate around the objective, method, data and value-feasibility and follow some guiding industry principles that emerged over the last years.

The goal is to translate data science use cases into something more tangible, bridging the gap between business and tech. I hope that these techniques empower you for your next journey in data science.

Happy to hear your thoughts!

Materials for download

Download the data science project template, structure and generic roadmap as Power Point slides here. You can also find a markdown of a project template here.

Microchips and fleet management

The ultimate duo for smart product at scale

February 7, 2022

Risto Saari Solution Architect

We have seen how cloud based manufacturing has taken a huge step forward and you can find insights listed in our blog post The Industrial Revolution 6.0. Cloud based manufacturing is already here and extends IoT to the production floor. You could define a connected factory as a manufacturing facility that uses digital technology to allow seamless sharing of information between people, machines, and sensors.

if you haven’t read it yet there is great article Globalisation and digitalisation converge to transform the industrial landscape.

There is still much more than factories. Looking around you will notice a lot of smart products such as smart TVs, elevators, traffic light control systems, fitness trackers, smart waste bins and electric bikes. In order to control and monitor the fleet of devices we need rock solid fleet management capabilities that we will cover in another blog post.

This movement towards digital technologies, autonomous systems and robotics will require the most advanced semiconductors to come up with even more high-performance, low power consumption, low-cost, microcontrollers in order to carry complicated actions and operations at Edge. Rise in the Internet of Things and growing demand for automation across end-user industries is fueling growth in the global microcontroller market.

As Software has eaten the world and every product is a data product there will only be SaaS Companies.

Devices at the field must be robust to connectivity issues, in some cases withdraw -30 ~ 70°C operating temperatures, build on resilience and be able to work in isolation most of the time. Data is secured on device, it stays there and only relevant information is ingested to other systems. Machine-to-machine is a crucial part of the solutions and it’s nothing new like explained in blog post M2M has been here for decades.

Microchip powered smart products

Very fine example of world class engineering is Oura Ring. On this scale it’s typical to have Dual-core arm-processor: ARM Cortex based ultra low power MCU with limited memory to store data up to 6 weeks. Even at this size it’s packed with sensors such as infrared PPG (Photoplethysmography) sensor, body temperature sensor, 3D accelerometer and gyroscope.

Smart watches are using e.g. Exynos W920, a wearable processor made with the 5nm node, will pack two Arm Cortex-A55 cores and an Arm Mali-G68 GPU. Even on this small size it includes 4G LTE modem and a GNSS L1 sensor to track speed, distance, and elevation when watch wearers are outdoors.

Taking a mobile phone from your pocket it can be powered by the Qualcomm Snapdragon 888 capable of producing 1.8 – 3 GHz 8 cores with 3 MB Cortex-X1.

Another example is Tesla famous of having Self-Driving Chip for autonomous driving chip designed by Tesla the FSD Chip incorporates 3 quad-core Cortex-A72 clusters for a total of 12 CPUs operating at 2.2 GHz, a Mali G71 MP12 GPU operating 1 GHz, 2 neural processing units operating at 2 GHz, and various other hardware accelerators. infotainment systems can be built on the seriously powerful AMD Ryzen APU powered by RDNA2 graphics so you play The Witcher 3 and Cyberpunk 2077 when waiting inside of your car.

Artificial Intelligence – where machines are smarter

Just a few years ago, to be able to execute machine learning models at Edge on a fleet of devices was a tricky job due to lack of processing power, hardware restrictions and just pure amount of software work to be done. Very often the imitation is the amount of flash and ram available to store more complex models on a particular device. Running AI algorithms locally on a hardware device using edge computing where the AI algorithms are based on the data that are created on the device without requiring any connection is a clear bonus. This allows you to process data with the device in less than a few milliseconds which gives you real-time information.

Figure 1. Illustrative comparison how many ‘cycles’ a microprocessor can do (MHz)

The pure power of computing power is always a factor of many things like the Apple M1 demonstrated how to make it much cheaper and still gain the same performance compared to other choices. So far, it’s the most powerful mobile CPU in existence so long as your software runs natively on its ARM-based architecture. Depending on the AI application and device category, there are various hardware options for performing AI edge processing like CPUs, GPUs, ASICs, FPGAs and SoC accelerators.

Price range for microcontroller board with flexible digital interfaces will start around 4$ with very limited ML cabalities . Nowadays mobile phones are actually very powerful to run heavy compute operations thanks to purpose designed super boosted microchips.

GPU-Accelerated Cloud Services

Amazon Elastic Cloud Compute (EC2) is a great example where P4d instances AWS is paving the way for another bold decade of accelerated computing powered with the latest NVIDIA A100 Tensor Core GPU. The p4d comes with dual socket Intel Cascade Lake 8275CL processors totaling 96 vCPUs at 3.0 GHz with 1.1 TB of RAM and 8 TB of NVMe local storage. P4d also comes with 8 x 40 GB NVIDIA Tesla A100 GPUs with NVSwitch and 400 Gbps Elastic Fabric Adapter (EFA) enabled networking. In practice this means you do not have to take coffee breaks so much and wait for nothing when executing Machine Learning (ML), High Performance Computing (HPC), and analytics. You can find more on P4d from AWS.

Top 3 benefits of using Edge for computing

There are clear benefits why you should be aware of Edge computing:

1. Reduced costs where costs for data communication and bandwidth costs will be reduced as fewer data will be transmitted.

2. Improved security when you are processing data locally, the problem can be avoided with streaming without uploading a lot of data to the cloud.

3. Highly responsive where devices are able to process data really fast compared to centralized IoT models.

Convergence of AI and Industrial IoT Solutions

According to a Gartner report, “By 2027, machine learning in the form of deep learning will be included in over 65 percent of edge use cases, up from less than 10 percent in 2021.” Typically these solutions have not fallen into Enterprise IT – at least not yet. It’s expected Edge Management becomes an IT focus by utilizing IT resources to optimize cost.

Take a look on Solita AI Masterclass for Executives how we can help you to bring business cases in life and you might be interested taking control of your fleet with our kickstart. Let’s stay fresh minded !

MLOps: from data scientist’s computer to production

MLOps refers to the concept of automating the lifecycle of machine learning models from data preparation and model building to production deployment and maintenance. MLOps is not only some machine learning platform or technology, but instead it requires an entire change in the mindset of developing machine learning models towards best practises of software development. In this blog post we introduce this concept and its benefits for anyone having or planning to have machine learning models running in production.

May 8, 2020

Miia Niemelä

Anniina Sallinen

Operationalizing data platforms, DataOps, has been among the hottest topics during the past few years. Recently, also MLOps has become one of the hottest topics in the field of data science and machine learning. Building operational data platforms has made data available for analytics purposes and enabled development of machine learning models in a completely new scale. While development of machine learning models has expanded, the processes of maintaining and managing the models have not followed in the same pace. This is where the concept of MLOps becomes relevant.

What is MLOps?

Machine learning operations, or MLOps, is a similar concept as DevOps (or DataOps), but specifically tailored to needs of data science and more specifically machine learning. DevOps was introduced to software development over a decade ago. DevOps practices aim to improve application delivery by combining the entire life cycle of the application – development, testing and delivery – to one process, instead of having a separate development team handing over the developed solution for the operations team to deploy. The definite benefits of DevOps are shorter development cycles, increased deployment velocity, and dependable releases.

Similarly as DevOps aims to improve application delivery, MLOps aims to productionalize machine learning models in a simple and automated way.

As for any software service running in production, automating the build and deployment of ML models is equally important. Additionally, machine learning models benefit from versioning and monitoring, and the ability to retrain and deploy new versions of the model, not only to be more reliable when data is updated but also from the transparency and AI ethics perspective.

Why do you need MLOps?

Data scientists’ work is research and development, and requires essentially skills from statistics and mathematics, as well as programming. It is iterative work of building and training to generate various models. Many teams have data scientists who can build state-of-the-art models, but their process for building and deploying those models can be entirely manual. It might happen locally, on a personal laptop with copies of data and the end product might be a csv file or powerpoint slides. These types of experiments don’t usually create much business value if they never go live to production. And that’s where data scientists in many cases struggle the most, since engineering and operations skills are not often data scientists’ core competences.

In the best case scenario in this type of development the model ends up in production by a data scientist handing over the trained model artifacts to the ops team to deploy, whereas the ops team might lack knowledge on how to best integrate machine learning models into their existing systems. After deployment, the model’s predictions and actions might not be tracked, and model performance degradation and other model behavioral drifts can not be detected. In the best case scenario your data scientist monitors model performance manually and manually retrains the model with new data, with always a manual handover again in deployment.

The described process might work for a short time when you only have a few models and a few data scientists, but it is not scalable in the long term. The disconnection between development and operations is what DevOps originally was developed to solve, and the lack of monitoring and re-deployment is where MLOps comes in.

How can MLOps help?

Instead of going back-and-forth between the data scientists and operations team, by integrating MLOps into the development process one could enable quicker cycles of deployment and optimization of algorithms, without always requiring a huge effort when adding new algorithms to production or updating existing ones.

MLOps can be divided into multiple practices: automated infrastructure building, versioning important parts of data science experiments and models, deployments (packaging, continuous integration and continuous delivery), security and monitoring.

Versioning

In software development projects it is typical that source code, its configurations and also infrastructure code are versioned. Tracking and controlling changes to the code enables roll-backs to previous versions in case of failures and helps developers to understand the evolution of the solution. In data science projects source code and infrastructure are important to version as well, but in addition to them, there are other parts that need to be versioned, too.

Typically a data scientist runs training jobs multiple times with different setups. For example hyperparameters and used features may vary between different runs and they affect the accuracy of the model. If the information about training data, hyperparameters, model itself and model accuracy with different combinations are not saved anywhere it might be hard to compare the models and choose the best one to deploy to production.

Templates and shared libraries

Data scientists might lack knowledge on infrastructure development or networking, but if there is a ready template and framework, they only need to adapt the steps of a process. Templating and using shared libraries frees time from data scientists so they can focus on their core expertise.

Existing templates and shared libraries that abstract underlying infrastructure, platforms and databases, will speed up building new machine learning models but will also help in on-boarding any new data scientists.

Project templates can automate the creation of infrastructure that is needed for running the preprocessing or training code. When for example building infrastructure is automated with Infrastructure as a code, it is easier to build different environments and be sure they’re similar. This usually means also infrastructure security practices are automated and they don’t vary from project to project.

Templates can also have scripts for packaging and deploying code. When the libraries used are mostly the same in different projects, those scripts very rarely need to be changed and data scientists don’t have to write them separately for every project.

Shared libraries mean less duplicate code and smaller chance of bugs in repeating tasks. They can also hide details about the database and platform from data scientists, when they can use ready made functions for, for instance, reading from and writing to database or saving the model. Versioning can be written into shared libraries and functions as well, which means it’s not up to the data scientist to remember which things need to be versioned.

Deployment pipeline

When deploying either a more traditional software solution or ML solution, the steps in the process are highly repetitive, but also error-prone. An automated deployment pipeline in CI/CD service can take care of packaging the code, running automated tests and deployment of the package to a selected environment. This will not only reduce the risk of errors in deployment but also free time from the deployment tasks to actual development work.

Tests are needed in deployment of machine learning models as in any software, including typical unit and integration tests of the system. In addition to those, you need to validate data and the model, and evaluate the quality of the trained model. Adding the necessary validation creates a bit more complexity and requires automation of steps that are manually done before deployment by data scientists to train and validate new models. You might need to deploy a multi-step pipeline to automatically retrain and deploy models, depending on your solution.

Monitoring

After the model is deployed to production some people might think it remains functional and decays like any traditional software system. In fact, machine learning models can decay in more ways than traditional software systems. In addition to monitoring the performance of the system, the performance of models themselves needs to be monitored as well. Because machine learning models make assumptions of real-world based on the data used for training the models, when the surrounding world changes, accuracy of the model may decrease. This is especially true for the models that try to model human behavior. Decreasing model accuracy means that the model needs to be retrained to reflect the surrounding world better and with monitoring the retraining is not done too seldom or often. By tracking summary statistics of your data and monitoring the performance of your model, you can send notifications or roll back when values deviate from the expectations made in the time of last model training.

Applying MLOps

Bringing MLOps thinking to the machine learning model development enables you to actually get your models to production if you are not there yet, makes your deployment cycles faster and more reliable, reduces manual effort and errors, and frees time from your data scientists from tasks that are not their core competences to actual model development work. Cloud providers (such as AWS, Azure or GCP) are especially good places to start implementing MLOps in small steps, with ready made software components you can use. Moreover, all the CPU / GPU that is needed for model training with pay as you go model.

If the maturity of your AI journey is still in early phase (PoCs don’t need heavy processes like this), robust development framework and pipeline infra might not be the highest priority. However, any effort invested in automating the development process from the early phase will pay back later and reduce the machine learning technical debt in the long run. Start small and change the way you develop ML models towards MLOps by at least moving the development work on top of version control, and automating the steps for retraining and deployment.

DevOps was born as a reaction to systematic organization needed around rapidly expanding software development, and now the same problems are faced in the field of machine learning. Take the needed steps towards MLOps, like done successfully with DevOps before.

Recapping AI related risks to Organizations

When they develop predictive models for business, Data Scientists often feel pressure to create results within a very short time span. These feelings may indicate a larger problem with risk management.

February 14, 2020

Paavo Toivanen

With uncertainty, the natural thing is to divest, i.e. not invest large sums in an uncertain endeavour. But AI risks are not easily disposed of in small projects either.

This might leave organizations perplexed as to what to do. On one hand, there is the call to embrace AI. On the other, the risks are real.

As a rule of thumb, a longer time perspective won’t hurt. Predictive modeling and automation are long-running investments. As such, they should be subject to risk assessment and scrutiny. There should be management for their entire life span.

Because of AI solutions’ partly speculative nature, their risk of failure is relatively high. A recent study underlined this, suggesting that roughly four out of five AI projects fail in the real world.

A predictive model has its particular strengths and weaknesses. But it has some recurring costs too, both implicit and explicit. Some of these costs may fall immediately to the supporting organization. And some of them might even fall outside of it.

The following (otherwise unrelated) tweet from a couple of days back pinpoints these risks neatly.

Leaving aside the social discourse, I very much agree on observations about organizations. There is a certain mindset about DS magically fixing business perspectives and organizational shortcomings. In my personal opinion, this is naïve at best. It is not an overstatement to call it dangerous in some cases.

The use of automation requires a certain robustness from surrounding structures.

AI as part of larger systems

In classical control theory, systems are designed around the principle of stability. A continuously working system, like a production line, is regulated with the help of measured and desired outputs. The problem is to make processes optimal by making them smooth, and get a good output per used resource ratio out of it.

Often, AI is a part of a larger production machinery. The whole process may involve human beings and other machine actors as well. Recent examples of AI victory make a lot of sense when seen in this kind of framing.

If we look at a famous example, Google AlphaGo’s victory over human players was supported by human maintained tournament protocols, servers, and arrangements. Not to speak of news media that helped to sculpt the event when it took place.

The AI’s job was relatively simple as comes to inputs and outputs: receive a board position and suggest the next move. Also how that AI learned to play Go in the first place was a result of multiple years of engineering. Its training was enabled by human work, and its progress was assessed by humans along the way.

The case of adverse outcomes

If we look at organizations, there are always hidden costs when adapting new procedures and processes. Predictive model performance, on the other hand, is largely measured by the number of explicit mistakes that it makes. These kinds of explicit mistakes may capture part of the cost of an automated solution. But fail rate is hardly a comprehensive measure in a complex setting.

Just like in a game some moves may be very costly as regards winning, some mistakes may be very costly to an organization.

One recent observation within the field has been about implicit “ghost” work that goes into keeping up AI appearances: fixing and hiding AI based errors, even fixing AI decisions in the first place before they have time to cause harm.

Now traditional production lines have fallback mechanisms. For example for turning the line off in a case of emergency. Emergency protocols are in place because unexpected events may occur in the real world. This is a very healthy mindset for any AI development also. We should embrace it fully. An organization should take these things into account when planning and assessing a new solution.

No matter how good preliminary results a solution should show, it will start failing sooner or later when something unexpected happens. And it will not fix itself. Its use will probably also create unexpected side effects even when it is doing a superb job.

AWS launches major new features for Amazon SageMaker to simplify development of machine learning models

Machine learning continues to grow on AWS and they are putting serious effort on paving the way for customers’ machine learning development journey on AWS cloud. The Andy Jassy keynote in AWS Re:Invent was a fiesta for data scientists with the newly launched Amazon SageMaker features, including Experiments, Debugger, Model Monitor, AutoPilot and Studio.

December 9, 2019

Miia Niemelä

Anniina Sallinen

AWS really aims to make the whole development life cycle of machine learning models as simple as possible for data scientists. With the newly launched features, they are addressing common, effort demanding problems: monitoring your data validity from your model’s perspective and monitoring your model performance (Model Monitor), experimenting multiple machine learning approaches in parallel for your problem (Experiments), enable cost efficiency of heavy model training with automatic rules (Debugger) and following these processes in a visual interface (Studio). These processes can even be orchestrated for you with AutoPilot, that unlike many services is not a black box machine learning solution but provides all the generated code for you. Announced features also included a SSO integrated login to SageMaker Studio and SageMaker Notebooks, a possibility to share notebooks with one click to other data scientists including the needed runtime dependencies and libraries (preview).

Compare and try out different models with SageMaker Experiments

Building a model is an iterative process of trials with different hyperparameters and how they affect the performance of the model. SageMaker Experiments aim to simplify this process. With Experiments, one can create trial runs with different parameters and compare those. It provides information about the hyperparameters and performance for each trial run, regardless of whether a data scientist has run training multiple times, has used automated hyperparameter tuning or has used AutoPilot. It is especially helpful in the case of automating some steps or the whole process, because the amount of training jobs run is typically much higher than with traditional approach.

Experiments makes it easy to compare trials, see what kind of hyperparameters was used and monitor the performance of the models, without having to set up the versioning manually. It makes it easy to choose and deploy the best model to production, but you can also always come back and look at the artifacts of your model when facing problems in production. It also provides more transparency for example to automated hyperparameter tuning and also for new SageMaker AutoPilot. Additionally, SageMaker Experiments has Experiments SDK so it is possible call the API with Python to get the best model programmatically and deploy endpoint for it.

Track issues in model training with SageMaker Debugger

During the training process of your model, many issues may occur that might prevent your model from correctly finishing or learning patterns. You might have, for example, initialized parameters inappropriately or used un efficient hyperparameters during the training process. SageMaker Debugger aims to help tracking issues related to your model training (unlike the name indicates, SageMaker Debugger does not debug your code semantics).

When you enable debugger in your training job, it starts to save the internal model state into S3 bucket during the training process. A state consists of for example weights for neural network, accuracies and losses, output of each layer and optimization parameters. These debug outputs will be analyzed against a collection of rules while the training job is executed. When you enable Debugger while running your training job in SageMaker, will start two jobs: a training job, and a debug processing job (powered by Amazon SageMaker Processing Jobs), which will run in parallel and analyze state data to check if the rule conditions are met. If you have, for example, an expensive and time consuming training job, you can set up a debugger rule and configure a CloudWatch alarm to it that kills the job once your rules trigger, e.g. loss has converged enough.

For now, the debugging framework of saving internal model states supports only TensorFlow, Keras, Apache MXNet, PyTorch and XGBoost. You can also configure your own rules that analyse model states during the training, or some preconfigured ones such as loss not changing or exploding/vanishing gradients. You can use the debug feature visually through the SageMaker Studio or alternatively through SDK and configure everything to it yourself.

Keep your model up-to-date with SageMaker Model Monitor

Drifts in data might have big impact on your model performance in production, if your training data and validation data start to have different statistical properties. Detecting those drifts requires efforts, like setting up jobs that calculate statistical properties of your data and also updating those, so that your model does not get outdated. SageMaker Model Monitor aims to solve this problem by tracking the statistics of incoming data and aims to ensure that machine learning models age well.

The Model Monitor forms a baseline from the training data used for creating a model. Baseline information includes statistics of the data and basic information like name and datatype of features in data. Baseline is formed automatically, but automatically generated baseline can be changed if needed. Model Monitor then continuously collects input data from deployed endpoint and puts it into a S3 bucket. Data scientists can then create own rules or use ready-made validations for the data and schedule validation jobs. They can also configure alarms if there are deviations from the baseline. These alarms and validations can indicate that the model deployed is actually outdated and should be re-trained.

SageMaker Model Monitor makes monitoring the model quality very easy but at the same time data scientists have the control and they can customize the rules, scheduling and alarms. The monitoring is attached to an endpoint deployed with SageMaker, so if inference is implemented in some other way, Model Monitor cannot be used. SageMaker endpoints are always on, so they can be expensive solution for cases when predictions are not needed continuously.

Start from scratch with SageMaker AutoPilot

SageMaker AutoPilot is an autoML solution for SageMaker. SageMaker has had automatic hyperparameter tuning already earlier, but in addition to that, AutoPilot takes care of preprocessing data and selecting appropriate algorithm for the problem. This saves a lot of time of preprocessing the data and enables building models even if you’re not sure which algorithm to use. AutoPilot supports linear learner, factorization machines, KNN and XGBboost at first, but other algorithms will be added later.

Running an AutoPilot job is as easy as just specifying a csv-file and response variable present in the file. AWS considers that models trained by SageMaker AutoPilot are white box models instead of black box, because it provides generated source code for training the model and with Experiments it is easy to view the trials AutoPilot has run.

SageMaker AutoPilot automates machine learning model development completely. It is yet to be seen if it improves the models, but it is a good sign that it provides information about the process. Unfortunately, the description of the process can only be viewed in SageMaker Studio (only available in Ohio at the moment). Supported algorithms are currently quite limited as well, so the AutoPilot might not provide the best performance possible for some problems. In practice running AutoPilot jobs takes a long time, so the costs of using AutoPilot might be quite high. That time is of course saved from data scientist’s working time. One possibility is, for example, when approaching a completely new data set and problem, one might start by launching AutoPilot and get a few models and all the codes as template. That could serve as a kick start to iterating your problem by starting from tuning the generated code and continuing development from there, saving time from initial setup.

SageMaker Studio – IDE for data science development

The launched SageMaker Studio (available in Ohio) is a fully integrated development environment (IDE) for ML, built on top of Jupyter lab. It pulls together the ML workflow steps in a visual interface, with it’s goal being to simplify the iterative nature of ML development. In Studio one can move between steps, compare results and adjust inputs and parameters. It also simplifies the comparison of models and runs side by side in one visual interface.

Studio seems to nicely tie the newly launched features (Experiments, Debugger, Model Monitor and Autopilot) into a single web page user interface. While these new features are all usable through SDKs, using them through the visual interface will be more insightful for a data scientist.

Conclusion

These new features enable more organized development of machine learning models, moving from notebooks to controlled monitoring and deployment and transparent workflows. Of course several actions enabled by these features could be implemented elsewhere (e.g. training job debugging, or data quality control with some scheduled smoke tests), but it requires again more coding and setting up infrastructure. The whole public cloud journey of AWS has been aiming to simplify development and take load away by providing reusable components and libraries, and these launches go well with that agenda.

Pose detection to help seabird research – Baltic Seabird Hackathon

Team Solita participated in Baltic Seabird Hackathon in Gothenburg last week. Based on the huge material and data set available, we decided to introduce pose detection as a method to understand seabird behavior and interactions. The results were promising, yet leave still room to improve.

November 29, 2019

Kimmo Kantojärvi

Baltic Seabird Hackathon

Some weeks ago we decided to participate in the Baltic Seabird Hackathon in Gothenburg. Hackathon was organised by AI INNOVATION of Sweden, Baltic Seabird Project, WWF, SLU and Chalmers University of Technology. In practise we spent few weeks preparing ourselves, going through the massive dataset and creating some models to work with the data. Finally we travelled to Gothenburg and spent 2 days there to finalise our models, presented the results and of course just spent time with other teams and networked with nice people. In this post we will dive a bit deeper on the process of creating the prediction model for pose detection and the results we were able to create.

Initially we didn’t know that much about seagulls, but during the couple weeks we got to learn wonderful details about the birds, their living habits and social interaction. I bet you didn’t know that the oldest birds are over 45 years old! During the hackathon days in Gothenburg we had many seabird experts available to discuss and ask more challenging questions about the birds. In addition we were given some machine learning and technical experts to support the work in the provided data factory platform. We decided to work in AWS sandbox environment, because it was more natural choice for us.

Our team was selected to have cross-functional expertise in design, data, data science and software development and to be able to work in multi-site setup. During the hackathon we had 3 members working in Gothenburg and 2 members working remotely from Sweden and Finland.

So what did we try and achieve?

Material available

For the hackathon we received some 2000 annotated images and 100+ hours of video from the 2 different camera locations in the Stora Karlsö island. Cameras were installed first time in 2019 so all this material was quite new. The videos were from the beginning of May when the first birds arrive to the same ledge as they do every year and coveraged the life of the birds until beginning of August when most of them had left already.

The images and videos were in Full HD resolution i.e. 1920×1080, which gives really good starting point. The angle of the cameras was above and most of the videos and images looked like the example below. Annotated birds were the ones on the top ledge. There were also videos and images from night time, which made it a bit more harder to predict.

Our idea and approach

Initial ideas from the seabird experts were related to identifying different events in the video clips. They were interested to find out automatically when egg was laid, when birds were leaving and coming back from fishing trips and doing other activities.

We thought that implementing these requirements would be quite straightforward with the big annotation set and thus decided to try something else and took a little different approach. Also because of personal interests we wanted to investigate what pose detection of the birds could provide to the scientists.

First some groundwork – Object detection

Before being able to detect the poses of the birds one needs to identify where the birds are and what kind of birds there are. We were provided with over 2000 annotated images containing annotations for adult birds, chicks and eggs. The amount of annotated chicks and eggs was far less than adult birds and therefore we decided to focus on adult birds. With the eggs there were also issue with the ledge color being similar to the egg color and thus making it much harder to separate eggs from the ledge.

We decided to use ImageAI (https://github.com/OlafenwaMoses/ImageAI) Python library for object detection. It has been built simplicity in mind and therefore it was fast and easy to take into use given the existing annotation set. All we had to do was to transform the existing annotations into PascalVOC format. After all initial setup we trained the model with about 200 images, because we didn’t want to spent too much time in the object detection phase. There is a good tutorial available how to do it with your custom annotation set: https://github.com/OlafenwaMoses/ImageAI/blob/master/imageai/Detection/Custom/CUSTOMDETECTIONTRAINING.md

Even with very lightweight training we were able to get easily over 95% precision for the detections. This was enough for our original approach to focus on poses rather than the activities. Otherwise we probably had continued to develop the object detection model further to identify different activities happening on the ledge as some of the other teams decided to proceed.

Based on these bounding boxes we were able to create 640×640 clips of each bird. We utilised FFMPEG to crop the video clips.

Now we got some action – Pose detection

For years now there has been research and models on detecting human poses from images and videos. Based on these concepts Jake Graving and Daniel Chae have developed DeepPoseKit (https://github.com/jgraving/DeepPoseKit) for detecting poses for animals. They have also focused on making the pose detection much faster than in previous libraries. DeepPoseKit is written in Python and uses in the background TensorFlow and Keras. You can read the paper about the DeepPoseKit here: https://elifesciences.org/articles/47994

The process for utilising DeepPoseKit has 4 main steps:

Create annotation set. This will define the resolution and color of the images used as basis for the model. Also the skeleton (joints and their connections to each other) needs to be defined in csv as a parent-child hierarchy. For the resolution it is probably easiest if the annotation set resolution matches close to what you expect to get from the videos. That way you don’t need to adjust the frames during the prediction phase. For the color scale you should at least consider whether the model works more reliable in gray scale or in RGB color space.
Annotate the images in annotation set. This is the brutal work and requires you to go through the images one by one and marking all skeleton keypoints. The GUI DeepPoseKit provides is pretty simple to use.
Train the model. This definitely takes some time even with GPU. There is also support for augmented data, so you can really improve the model during the training.
Create predictions based on the model.

You can later increase the size of the annotated set and add more images to the set. Also the training can be continued based on existing model and thus the library is pretty flexible.

Because the development of DeepPoseKit is still in the early phases there are at least 2 considerable constraints to remember:

Library can only detect individual poses and if you have multiple animals in the same frames, you need some additional steps to separate the animals
DeepPoseKit only supports image resolutions that can be repeatedly divided by 2 (e.g. 320×320, 640×640)

Because of these limitations and considering our source material, we came up with following process:

So we decided to create separate clips for each identified bird and run pose detection for these clips and then in the end combine the individual pose detection predictions to the original video.

To get started we needed the annotation set. We decided to use the provided sample script (https://github.com/jgraving/DeepPoseKit/blob/master/examples/step1_create_annotation_set.ipynb) that takes in a video and picks random frames from the video. We started originally with 100 images and increased it to 400 during the hackathon.

For the skeleton we ambitiously decided to model 16 keypoints. This turned out to be quite a task, but we managed to do it. In the end we also created a simplified version of the skeleton and the annotations including only 3 keypoints (beck, head and tail). The original skeleton included eyes, different parts of the wings and legs.

This is how the annotations for complex skeleton look like:

The simplified skeleton model has only 3 keypoints:

With these 2 annotation sets we were able to create 2 models (simple with 3 keypoints and complex with 16 keypoints).

To train the model we pretty much followed the sample script provided by the developers of DeepPoseKit (https://github.com/jgraving/DeepPoseKit/blob/master/examples/step3_train_model.ipynb). Due to limited time available we did not have time to work so much with the augmented data, which could have improved the accuracy of the models. Running a epoch with 45 steps took with AWS p3.2xlarge instance (1 GPU) about 5-6 minutes for the complex model. We managed to run around 45 epochs in total given a final validation loss around 25. Because the development is never a such straightforward process, we had to start the training of the model from scratch few times during the hackathon.

The results

When the model was about ready, we run the detections for few different videos we had available. Once again we followed the example in DeepPoseKit library (https://github.com/jgraving/DeepPoseKit/blob/master/examples/step4b_predict_new_data.ipynb). Basically we ran through the individual clips and frame by frame create the skeleton prediction. After we had this data together, we transformed the prediction coordinates resolution (640×640) to match the original video resolution (1920×1080). In addition to the original script we fine-tuned the graphs a bit and included for example order number for each skeleton. In the end we had a csv file containing for each identified object for each frame in the video for each identified skeleton keypoint the keypoint coordinates and confidence percentage. We added also the radius and degrees between keypoints and the distance of connected keypoints. Radius could be later used to analyse for example in which direction the bird is moving its head. In practice for one identified bird this generated 160 rows of data per second. Below is a sample dataset generated.

The results looked more promising when we had more simple setup in the area of the camera. Below is an example of 2 birds’ poses visualised with the complex model and the results seems quite ok. The predictions follow pretty well the movement of a bird.

The challenges are more obvious when we add more birds to the frame:

The problem is clear if you look at one of the identified bird and it’s generated 640×640 clip. Because the birds are so close to each other, one frame contains multiple birds and the model starts to mix parts of the birds together.

The video above also shows that the bird on the right upper corner is not correctly modeled when the bird expands its wings. This is just an indication that the annotation set does not include enough various poses of the birds and thus the model doesn’t learn those poses.

Instead if we take the more simple model in use in the busy video, it behaves a bit better. Still it is far from being optimal.

So at this point we were puzzling how to improve the model precision and started to look for additional methods.

Shape detection to the help?

One of the options that came to our mind was to try leave only the identified bird visible in the 640×640 frames. The core of the idea was that when only individual bird would be visible in the frame then the pose detection would not mess up with other birds. Another team had partly used this method to rule out all the birds in the distance (upper part of the image). Due to the shape of the birds nothing standard such as vignette filter would work out of the box.

So we headed out to look for better alternatives and found out Mask RCNN (https://github.com/matterport/Mask_RCNN). It has a bit similar approach to the pose detection that you first have to annotate a lot of pictures and then train the model. Due to the limited time available we had to try using Mask RCNN just with 20 annotated images.

After very quick training it seemed as the model had really low validation loss. But unfortunately the results were not that good. As you can see from the video below only parts of the birds very identified by the shape detection (shapes marked as blue).

So we think this is a relevant idea, but unfortunately we didn’t have time to verify this idea.

Another idea we had was to detect some kind of pattern that would help identify the birds that are a couple. We played around with the idea that by estimating the density map of each individual bird we could identify the couples that have a high density map. If the birds are tracked then the birds that have a high density output would be classified as a couple. This would end up in a lot of possibilities for the scientists to track the couples and do research on their patterns. For this task first thing we have to do is to put a single marker on each individual bird. So instead of tagging the bird as a whole, we instead tag the head of the bird which is a single point. Image the background as black and the top of the head of each individual bird is marked with the color. The Deep learning architectures used for this were UNET and FCRN(Fully Convolutional Regressional Networks). These are the common architectures used when estimating density maps. We got the idea from this blogpost(https://towardsdatascience.com/objects-counting-by-estimating-a-density-map-with-convolutional-neural-networks-c01086f3b3ec) and how it is used to estimate the density maps which is then used to count the number of objects. We ended up using this to identify both the couples and the number of penguins. Sadly the time was not enough to see some reasonable results. But the idea was very much appreciated by the judges and could be something that they could think about and move forward with.

Another idea would be to use some kind of tagging of the birds. That would work as long as the birds remain on the ledge, but in general it might be a bit challenging as the videos are very long and the birds move around the ledge to some extent.

What next?

Well with the provided results we won the 3rd price in the hackathon. According to the jury biggest achievements were related to the pose detection and the possibilities it opens up for science. It seems that there is not much research done for the social interactions of seagulls and our pose detection model could help on that.

It was clear for all of us that we want to donate the money and have now decided to give it as a scholarship for a student who will take the models and work them further for the benefit of seabird science. Will be interesting to see what the models can tell us about the life of the baltic seabirds, their social interaction and in general socioeconimics of Baltic Sea.

On behalf our Team Solita (Mari Harju, Jani Turunen, Kimmo Kantojärvi, Zeeshan Dar and Layla Husain),

Kimmo & Zeeshan

Greetings from the Bay Area – IBM Think 2019 Part 1

Our Solita crew participated in IBM Think held in San Francisco in February. IBM Think is an annual technology and business conference, where the latest technology trends and new product releases from IBM are introduced.

March 7, 2019

Anu Heikkilä

IBM Think in San Francisco was a huge technology event with approximately 27 000 attendees, thousands of different sessions, presentations and keynotes held in different venues in San Francisco.

Due to the size of the conference we wanted to focus on certain key areas: AI, machine learning and analytics. There were about 500 data and analytics presentations to choose from. Topics covered areas such as data science, AI, business and planning analytics, hybrid data management, governance and integration. IBM Cloud Private for Data alone had 18 sessions where this new product was presented.

Solita has a strong expertise in the area of analytics

Solita has a strong expertise in the area of analytics (Cognos Analytics & Planning Analytics) and we wanted to strengthen our competence and learn about upcoming releases of those products. We had a chance to meet IBM’s offering management and discuss new features and give feedback. There were also several hands-on labs where one could test upcoming features of products.

Although Planning Analytics (PA) was a bit of a sidekick compared to buzzwords like AI and Blockchain, the PA sessions provided good information about the new features and on-going development. In addition, there were several different client presentations providing insights into their CPM solutions. Interestingly, many of those presentations were still focusing on TM1 technology and not on Planning Analytics even though TM1 support will end on 30th of September 2019.

AI and data science were strongly present on IBM Think agenda. Success stories on AI implementations were told for example by Carrefour (retail chain who wanted to optimize existing and new supermarket investment decisions), Nedbank (bank that used predictive maintenance to optimize AMT services), Red Eléctrica de España (electrical company that wanted to predict generation and optimize production) and Daimler (truck manufacturer using AI to comprehend the complexity of product configurations).

Also AI project best practices were shared in many of the sessions.

Also AI project best practices were shared in many of the sessions. Best practices included starting with a quick-win use case to gain buy-in from management and business, having a business sponsor for the project, measuring clear KPIs and business impact and, good quality data, creating effective teams, choosing the right tools, etc. These are all principles we definitely agree on and that are already now implemented in Solita data projects.

What else did we learn in IBM Think 2019? Deep dive into learnings coming up!

A data scientist’s abc to AI ethics, part 2 – popular opinions about AI

In this series of posts I’ll try to paint the borderline between AI and ethics from a bit more analytical and technically oriented perspective. Here I start to examine how AI is perceived, and how we may start to analyze ethical agency.

February 25, 2019

Paavo Toivanen

Multiple images

From 3D apps to evil scifi characters, in everyday use it can mean almost anything. It’s a bit of a burden that it is associated with Terminator, for instance. Or that the words deep learning might receive god like overtones in marketing materials.

Let’s go on with some AI related examples. On a PowerPoint slide, AI might be viewed as an economic force. For yet another example, we could look at AI regulation.

Say a society wants to regulate corporate action, or set limits to war damage with weapon treaties. Likewise, core AI activities might need legal limits and best practices. Like, how to make automatic decisions fair. My colleague Lassi wrote a nice recap about this also from an AI ethics perspective.

Now in my view, new technology won’t relieve humans from ethics or moral responsibility. Public attention will still be needed. Like Thomas Carlyle suggested, publicity has some corrective potential. It forces institutions to tackle their latent issues and ethical blind spots. Just like public reporting helps to keep corporate and government actions in check.

Then one very interesting phenomenon, at least from an analytical perspective, are people’s attitudes towards machines.

Especially in connection to ethics, it is relevant how we tend to personify things. Even while we consciously view a machine as dumb, we might transfer some ethical and moral agency to it.

A good example is my eight year old son, who anticipated a new friend from Lego Boost robot. Even I harbor a level of hate towards Samsung’s Bixby™ assistant. Mine is a moral feeling too.

These attitudes can be measured to a certain extent, in order to improve some models. This I’ll touch a bit later.

Perceived moral agency

There is a new analytical concept that describes machines and us, us with machines. This concept of perceived moral agency describes how different actors are viewed.

Let’s say we see a bot make a decision. We may view it as beneficial or harmful, as ethical or unethical. We might harbor a simple question whether the bot has morals or not. A researcher may also ask how much morals the bot was perceived to have.

Here we have two levels of viewing the same thing, a question about how much a machine resembles humans, and then a less intermediate one about how it is perceived in the society.

I think that in the bigger picture we make chains of moral attribution, like in my Bixby case. My moral emotion is conveyed towards Samsung the company, even if my immediate feelings were triggered by Bixby the product. I attribute moral responsibility to a company, seeing a kind of secondary cause for my immediate reactions. The same kind of thing occurs when we say that the government is responsible of air pollution, for instance.

What’s more to the point, these attributive chains apply to human professionals too. An IT manager or a doctor is bound by professional ethics. Their profession in turn is bound by the consensus within that group. If a doctor’s actions are perceived as standard protocol, it is hard to see them as personal ethics or lack of it.

Design and social engineering

Medical decision assistants and other end products are the result of dozens of design choices. And sometimes design choices, if not downright misleading, voluntarily support illusions.

For instance, an emotional reaction from a chat bot. It might create an illusion that the bot “decides” to do something. We may see a bot as willing or not willing to help. This choice may even be real in some sense. A bot was given a few alternative paths of action, and it did something.

Now what is not immediately clear are a bot’s underlying restrictions. We might see a face with human-like emotions. Then we maybe assume human emotional complexity behind the facade.

Chat bots and alike illustrate the idea of social engineering. What it means is that a technical solution is designed to be easy to assimilate. If a machine exploits cultural stereotypes and roles in a smart way, it might get very far with relatively little intelligence.

A classic example is therapist bot ELIZA from 1960s. Users would interact via a text prompt, and ELIZA would respond quite promptly to their comments. Maybe it asked its “patient” to tell a bit more about their mother. It didn’t actually understand any sentence meanings, but it was designed to react in a grammatically correct way. As the reports go, some of the users even formed an addictive relationship with it.

The central piece of social engineering was to model ELIZA as a psychotherapist. This role aided ELIZA in directing user attention. It might also have kept them away from sizing up ELIZA and its limitations. To read more about ELIZA, you may start from its Wikipedia page.

Engagement and management

ELIZA of course was quite harmless. For toys, it is even desirable to entice the imagination. A human facade can create positive commitment in the user. This type of thing is called engagement in web marketing.

On the other hand, social engineering is hard work and not always rewarding. An interesting related tweet came from a game scriptwriter.

This scriptwriter would wish players to submerge and have profound emotional experiences in her games. In her day to day work she had noticed a constant toil with her characters. Was this need for detail even bigger than, say, in a novel? Yes, she suggested.

The scriptwriter also analyzed this a bit. She noticed that repetitive out-of-context action is likely to distance a user. What’s more, it is also very likely to occur when prolonged interaction is available.

I’m tempted to think that these are the two sides of engaging a user. The catch and the aftermath.

As far as modeling and computational perspective go, another significant theme is the nature of automatic decisions.

The most relevant questions are these. How is the world modeled from the decision making agent’s perspective? What kind of background work does it require? How then about management? What kind of data does the agent consume? How to control data quality?

These will get a bit more detail in my next post. Stay tuned, and thanks for reading!

This is the second post about AI and ethics, in a series of four.

A Data scientist’s abc to AI ethics, part 1 – About AI and ethics

In this series of posts I’ll try to paint the borderline between AI and ethics from a bit more analytical and technically oriented perspective. My immediate aims are to restrict hype meanings, and to draw some links to related fields.

January 25, 2019

Paavo Toivanen

In the daily life we constantly encounter new types of machine actors: in social media; in the grocery store; when negotiating a loan. Some of them appear amiable and friendly, but almost all are difficult to understand deeply. Their constitution may be cryptic and inaccessible.

It’s of course a subject of interest as in which ways algorithms and machine actors impact our society. Maybe they do not remain value-free or neutral in a larger context.

Philosophical and other types of interest

The Finnish Philosophical Society’s January 2019 colloquium targeted these kinds of questions. Talks concerned AI, humanity, and society at large. Prominent topics included the existence of machine autonomy and ethics. One interesting track concerned the definitions of moral and juridic responsibility. Many weighty concepts like humanity, personhood, and the aesthetics of AI, were discussed too.

From a purely philosophical perspective, technology might be viewed as one particular type of otherness. It is something out of bounds of direct personal interest.

On the other hand the landscape around AI may appear supercharged at the moment. Even the word AI reveals many interest vectors. “Whose agenda does the ethics of AI in each case forward?” Maija-Riitta Ollila asked in her presentation.

No wonder many people with a technical background are a bit wary of the term. Often it would be more appropriate to use a less charged one – some good alternatives include machine learning, statistical analysis, and decision modeling.

Between AI and ethics

Most of the talks in the colloquium shared this very sensible view that AI as a term should be subject to critique. One moral responsibility then for tech people is just shooting down related hype.

But the landscape of AI and ethics is complex and controversial. As if to back this observation, many presenters in the colloquium openly asked the audience to correct them on technical points if they should go wrong.

For instance, cognitive and emotional modeling are named as two quite distinct areas of research within cognitive science and neuroscience. The first holds much more progress than the other, when we compare their achievements. Logic is relatively easier to simulate than emotional attitudes. We may equate this with the innate complexity of human action and information processing that this simulation platform only exemplifies.

Furthermore, as illustrated by many intriguing thought experiments, problems arise when we try to attribute an ethical or moral role to a machine actor. Some of these I’ll try to explicate in later posts.

A bit of a discomfort for me has been the relationship between AI discussion and ethics. Is the talk always morally sound? Sometimes it felt that ethics won’t fit into the world of AI marketing. If I should define ethics with a few words, I would probably state that it is deep thinking about prevalent problems of good and bad.

Some wisdom about AI

We may juxtapose this with a punchline about contemporary AI. “[The] systems are merely optimization machines, and ultimately, their target is optimization of business profit”, one fellow Data scientist wryly commented to me.

So on the surface level, computer science and mathematical problems might not connect to ethics at all. The situation may be alike in sales and marketing. Also in philosophy, formal logic on the one hand and ethics and cultural philosophy on the other are largely separate areas.

What to make of this divide? My next post will examine popular perceptions of AI in the wild.

This is the first of four posts that will handle the topics of AI and ethics from a bit more technical angle.