Lou Zhang

Experiments in Optimizing Tool Life with Amperage Data (Part 1)

Introduction

A major challenge in manufacturing is predicting how long cutting tools in your machine are going to last. They get worn out eventually after steadfastly grinding through really hard metal, creating part ($$) after part ($$) for you.

Tools are “rated” for a certain life by the manufacturer, but actual life varies widely with the type of material being cut and the ambient conditions in the machine shop. In response, operators have developed some rules of thumb to determine when to change them out.

1. Tools are often run to failure and then backed out by X number of parts. For example, Tool 4 runs until it breaks, and the operator notes how many parts it was able to get through before crashing. Then, they determine that a “safe” limit might be 100 parts less than what it was able to do on that run. But this usually results in going through more tools than you need to and therefore unnecessary cost.

2. Quality control departments see certain markings on finished parts that indicate that a tool is about to give. They then communicate to the operator that these signs have shown up and tools are switched out immediately. But then you have a batch of parts that have already been damaged by the worn out tool and need to be thrown out.

3. “Smart tools” coming into vogue have built-in sensors that can alert you when the tool needs to be changed. But these are often expensive and require integration with another analytics platform, subscription costs, etc.

4. Experienced operators, through years of operating machine tools, can hear when a tool is about to break. These differences are often subtle and subjective. But talent is becoming harder and harder to find and retain on manufacturing floors. The old “I can hear when something is wrong” approach may fade as a relic of the 20th century.

Of the above, method #4 appears to be the most widely used. As experienced operators retire and become harder to replace, MachineMetrics wants to be well-positioned to replace that accumulated domain knowledge with an automated approach. In this series, we’ll cover a sensor-based method to automatically optimize life on a subset of tools used in machining.

Hardware Considerations

In our last series, we noted how our data comes straight off the controls of the machine and therefore doesn’t require aftermarket sensor installations. However, sensors can still be useful to supplement control data, as in the case of tool life optimization. Over the last few months, our engineering team has developed a method to seamlessly integrate sensor data with control data and combine them into one data stream coming off the machine.

We considered many different types of sensors for this project — vibration, temperature, acoustic, etc. We realized that these sensors are highly sensitive to ambient conditions and outside interference. For an obvious example, a person bumping the machine would offset our readings for vibration, and construction going on in the factory might interfere with acoustic readings.

 
An early prototype vibration sensor setup with a curiously bemused data scientist

Wireless sensors may get around these issues because they can be placed in very isolated spaces in the machine, but we found that they are not commercially feasible for our use case because of battery life. Most wireless sensors in the field today transmit data only every couple of minutes, or even every few hours or days, focusing on long-term degradation of machines that can be picked up with infrequent readings. At that rate, batteries might only need to be replaced every few years, which makes for a feasible service model. Transmitting every second, or a few times every second, would require the battery to be replaced nearly every month.

The MM I/O Device allows us to interface with any wired sensor

Thus, wireless sensors are not practical for our problem space, forcing us to place wired sensors in locations near our Edge or I/O Device (pictured above). The I/O Device records voltage output from analog and digital devices, is connected via Ethernet, and needs to be powered through an outlet.

Under very specific circumstances, wired sensors of these types may work, but the approach is by no means generalizable given the inherently diverse nature of manufacturing.

However, there is one type of sensor which is less prone to interference — the current transducer (CT). The CT latches onto a live wire in the machine and captures the amperage flowing through the wire. This is a way of capturing power draw, which we have found to be the most unadulterated signal. But, power draw can still be “impure,” especially if the CT is clamped to a wire that delivers power for multiple purposes.

 
The current transducer, which is clamped to the spindle power

For example, if the CT is clamped to the main power for the machine, auxiliary functions like the coolant pump turning on/off may interfere with the purity of the reading. To get around this issue, we clamped to one of the three phase legs of the spindle power supply. The spindle power is the direct power feed to the cutting tool of the machine. Thus, we only read power from machining operations.

 
A comparison of spindle (top) vs. main (bottom) amperage. Notice how the spindle amperage has much less noise

Our Hypothesis

We start with the question of how to predict tool life from the machine’s spindle power draw, as collected from a CT. Our hypothesis is that as the tool gets more and more worn, more and more power is required to do the same amount of work. Since machine tools are commanded to produce the same part over and over again with precision, the machine’s power draw compensates for the dullness of the tool.

 
However strong a tool is, it gets worn out over time. More power is needed to do the same amount of work with a duller tool

Therefore, our desired result is to see a correlation between amperage drawn and the number of parts produced.

Though this may seem simple in concept, coming up with a suitable model to test our hypothesis took significant effort. The first questions we needed to answer were: how do we appropriately quantify power draw, and how do we model how much work was done by the tool? Based on domain knowledge, we decided to quantify it as the integral of the amperage drawn over time.

This is because a tool can likely perform a certain amount of work (absorb a total amount of energy) over its useful lifespan. We assume that the CT amperage trends with a specific tool’s power draw, so that value integrated over time is representative of the work the tool has done. There may be a change in the work required near the end of a tool’s useful life, or a gradual upward shift in the amount of work required, for which we could set a threshold and alerts.

A visualization of our hypothesis, where s represents the amount of work done by a tool for a single part creation. Each part-tool pair has a single s value (in amp hours [Ah]).
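To make the integral concrete, here is a minimal Python sketch of how a single s value could be computed from timestamped CT readings with the trapezoidal rule. The function name `amp_hours` and the sample readings are assumptions made for illustration; the post does not specify which integration method was actually used.

```python
from datetime import datetime, timedelta


def amp_hours(samples):
    """Trapezoidal integral of (timestamp, amps) samples, in amp hours.

    `samples` must already be sorted by timestamp.
    """
    total = 0.0
    for (t0, a0), (t1, a1) in zip(samples, samples[1:]):
        hours = (t1 - t0).total_seconds() / 3600.0
        total += 0.5 * (a0 + a1) * hours
    return total


# Hypothetical readings: a constant 10 A held for 30 minutes, one sample per second.
start = datetime(2018, 1, 1, 8, 0, 0)
readings = [(start + timedelta(seconds=i), 10.0) for i in range(1801)]

s = amp_hours(readings)  # 10 A for 0.5 h, so roughly 5 Ah
```

The trapezoidal rule is used here because it tolerates slightly uneven sampling intervals; a plain sum of amps times a fixed interval would be an equally reasonable choice for evenly spaced data.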

Once we obtain the amount of work done by a single tool for a single part (s), we want to continue obtaining that value across time for all tools, by part. In the end, we’d like to generate a vector of s’s for each tool, like the following:

  1. Tool 1:Part 1 = 5 Ah, Tool 1:Part 2 = 5.2 Ah, Tool 1:Part 3 = 5.3 Ah…
    Total amount of work done across all parts for Tool 1 = {5, 5.2, 5.3, …}
  2. Tool 2:Part 1 = 2 Ah, Tool 2:Part 2 = 2.2 Ah, Tool 2:Part 3 = 2.3 Ah
    Total amount of work done across all parts for Tool 2 = {2, 2.2, 2.3, …}
  3. Tool 3:Part 1, Tool 3:Part 2, Tool 3:Part 3…

There are usually multiple tools involved in a machining operation. Each tool typically breaks individually.

Before we’re able to continue, we need to investigate the data we can collect off each machine, perform some data wrangling, and think about how to structure the problem from a data science perspective.

Data Wrangling and Exploratory Data Analysis

We stream data for the following fields off each machine:

  1. Timestamp
  2. Tool ID
  3. Part Count Increments
  4. Machine Status
  5. Machine Faults
  6. Current Transducer (amps)

A snapshot of data fields we’re working with

These fields are a combination of both control and sensor data. We can use each one of these data fields to help us achieve our goal.
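Part Count Increments only tell us when a part finishes, so one plausible way to label every reading with a part number is to keep a running total of those increments. The sketch below is an assumption about how that could be done: the row layout, the status strings and the values are made up for illustration, and it assumes the reading that carries an increment still belongs to the part that just finished.

```python
from itertools import accumulate

# Hypothetical rows: (seconds since start, tool_id, part_count_increment, status, amps)
rows = [
    (0, 6001, 0, "ACTIVE", 4.1),
    (1, 6001, 0, "ACTIVE", 4.3),
    (2, 6002, 1, "ACTIVE", 2.0),  # increment: the first part just finished
    (3, 6001, 0, "ACTIVE", 4.2),
    (4, 6002, 1, "ACTIVE", 2.1),  # increment: the second part just finished
    (5, 6001, 0, "READY", 0.4),
]

increments = [row[2] for row in rows]

# Number of parts completed strictly before each row.
completed_before = [0] + list(accumulate(increments))[:-1]

# Part numbers start at 1, so each row belongs to part (completed_before + 1).
part_numbers = [n + 1 for n in completed_before]
```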

Looking at the raw amperage data coming off one of the machines, we see that there’s significant work that needs to be done to get this into a usable form.

A big, ugly mess.

Handling Inactive and Faulty Machining Periods

We first need to exclude periods where the machine is inactive. In theory there shouldn’t be any power draw during these periods, but the data shows that there is. This is because the spindle at rest still consumes power, and the pattern is exacerbated because the machine sometimes self-configures to a default or “safer” resting position when the program is stopped. We don’t want this power consumption to be mixed in with the actual work done for machining.

 
Power is still consumed when status is not active, so we need to filter those modes out
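One way to apply this filter, sketched below in Python, is to split the stream into contiguous runs of active readings instead of simply dropping the inactive rows. Keeping the runs separate avoids integrating across the gap left by a removed idle period. The dictionary keys and the "ACTIVE" and "READY" status strings are assumptions for illustration; actual status values depend on the machine control.

```python
from itertools import groupby


def active_segments(samples, active_status="ACTIVE"):
    """Split time-ordered samples into contiguous runs with the active status."""
    segments = []
    for is_active, run in groupby(samples, key=lambda s: s["status"] == active_status):
        if is_active:
            segments.append(list(run))
    return segments


# Hypothetical readings: two active stretches separated by an idle period.
samples = [
    {"t": 0, "status": "ACTIVE", "amps": 4.0},
    {"t": 1, "status": "ACTIVE", "amps": 4.2},
    {"t": 2, "status": "READY", "amps": 0.5},
    {"t": 3, "status": "READY", "amps": 0.5},
    {"t": 4, "status": "ACTIVE", "amps": 3.9},
]

segments = active_segments(samples)
```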

Let’s plot execution status over time, where blue is not active and red is active.

Given the number of data points, we should zoom into a specific time period to see the effects of filtering out inactive periods. On the top we have unfiltered data, and the bottom is filtered.

 

Zooming in one level further (to one of those big vertical lines) and doing the same, we see the following:

 
 

Once we filter out inactive time periods, we need to exclude part numbers where there were faults or anomalies, as these will skew the results of our analysis.

 
A sample of the types of faults we can see. Part numbers 26, 47, 100, 326 and 327 get axed.
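A minimal sketch of this step, assuming each reading has already been labeled with a part number and a fault field that is empty when nothing is wrong: collect every part number that saw a fault, then drop all readings for those parts. The field names and the "ALARM" value are hypothetical.

```python
# Hypothetical readings labeled with part number and fault state.
samples = [
    {"part": 25, "fault": None, "amps": 4.0},
    {"part": 26, "fault": "ALARM", "amps": 9.5},
    {"part": 26, "fault": None, "amps": 4.1},
    {"part": 27, "fault": None, "amps": 4.0},
]

# Any part with at least one faulted reading is excluded entirely.
faulted_parts = {s["part"] for s in samples if s["fault"] is not None}

clean = [s for s in samples if s["part"] not in faulted_parts]
```

Dropping the whole part, and not just the faulted readings, keeps every remaining per-part integral complete.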

Auditing Part Level Consistency

Next, we need to do a sanity check on power draw by part number. Each part in total should have somewhat consistent amperage draw; let’s make sure this is the case. 

As is usually the situation with manufacturing data, this does not appear to be the case. There is significant inconsistency in the number of minutes it takes to manufacture a part. These could be incorrect part counts, warmup parts, or some other aberration of the data collection or machining process.

We use a simple method to filter out anomalous part times, which is to find the median part creation time and set a ±20% tolerance around it. After filtering out those parts, we have a much cleaner plot. There’s still some variance, but this is expected since there could be localized variability in material hardness, chip buildup, voltage drift and/or possible changes in coolant flow. 
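The median filter described above could look roughly like the following. The function name and the cycle times are made up for illustration; only the ±20% tolerance around the median comes from the text.

```python
from statistics import median


def filter_by_cycle_time(cycle_minutes, tolerance=0.20):
    """Keep parts whose creation time is within +/- tolerance of the median.

    `cycle_minutes` maps part number -> minutes taken to make that part.
    """
    med = median(cycle_minutes.values())
    low, high = med * (1 - tolerance), med * (1 + tolerance)
    return {part: m for part, m in cycle_minutes.items() if low <= m <= high}


# Hypothetical cycle times: part 3 looks like a miscount, part 5 like a warmup part.
cycle_minutes = {1: 10.0, 2: 10.5, 3: 3.0, 4: 9.8, 5: 25.0, 6: 10.2}

kept = filter_by_cycle_time(cycle_minutes)
```

The median is a sensible center here because a handful of very long or very short parts barely move it, whereas they would drag a mean toward themselves.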

Auditing Tool Level Consistency

We also want to isolate power draw by Tool ID. This is because each tool performs a different operation and is going to draw a different amount of power for each part. Tools also break individually; it’s not like every single tool in the machine is going to break at the same time.

Let’s visually inspect the pattern of tool changes to make sure they’re consistent. In the below graph, the tool being used at that time is denoted as a horizontal line, with different tools on different levels.

 
A consistent tooling pattern occurs by part

Though we see a clear pattern at first glance, we need to verify this across the board. We audited the number of tools used per part to make sure they’re consistent (in this case, 12 tools/part).
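An audit like this can be sketched as a count of distinct tools seen per part, flagging any part that deviates from the expected number. The example below uses three tools per part to stay short (the machine in this post uses 12), and the function name and data are assumptions.

```python
from collections import defaultdict


def parts_with_unexpected_tool_count(samples, expected_tools):
    """Return {part: distinct tool count} for parts that deviate from expected.

    `samples` is an iterable of (part_number, tool_id) pairs.
    """
    tools_by_part = defaultdict(set)
    for part, tool in samples:
        tools_by_part[part].add(tool)
    return {
        part: len(tools)
        for part, tools in tools_by_part.items()
        if len(tools) != expected_tools
    }


# Hypothetical readings: part 2 never used tool 6002.
samples = [
    (1, 6001), (1, 6002), (1, 6003),
    (2, 6001), (2, 6003),
    (3, 6001), (3, 6002), (3, 6003),
]

suspect = parts_with_unexpected_tool_count(samples, expected_tools=3)
```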

After verifying for consistency, we can proceed to the next step, which is to take the integral for each part-tool combo, and string together all the part-level amperages by tool. The diagrams below illustrate this process for tool 6001.

Example Part

Tool 6001 is used at the same time during each one of our parts. To review, we want to construct a vector where we obtain the total amount of work done across all parts, by part, for Tool 6001.

 
Each red area, representing the amperage consumed by one tool cutting its share of one part, is about 0.08 Ah. Given a 208 V input, that is about 16.64 Wh of energy (Wh = V × Ah). For reference, your phone’s battery holds about 2 Ah at 3.7 V, or 7.4 Wh, so fully discharging it uses less than half of the energy consumed here.
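The arithmetic in this comparison can be checked with a few lines of Python. This follows the same simplification as the text, treating energy as volts times amp hours and ignoring power factor and three-phase effects; the function name is an assumption.

```python
def watt_hours(amp_hours, volts):
    """Energy in watt hours from charge in amp hours at a fixed voltage."""
    return amp_hours * volts


cut_wh = watt_hours(0.08, 208)   # one tool's share of one part, about 16.64 Wh
phone_wh = watt_hours(2.0, 3.7)  # a full phone battery, about 7.4 Wh

ratio = phone_wh / cut_wh        # a bit under one half
```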

To break this down,

Tool 6001:Part 1 = 0.08 Ah, Tool 6001:Part 2 = 0.09 Ah
Total amount of work done across all parts for Tool 6001 = {0.08 Ah, 0.09 Ah, …}

Once we string up all these individual values per part, we get a representation of the trend of work done, by part (x-axis) for the tool.
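Putting the pieces together, the sketch below integrates amperage for each part-tool pair and then strings the results into one vector per tool, ordered by part. It assumes readings that are already filtered, time-ordered and labeled with tool and part number; the function name and the readings are illustrative, chosen so that Tool 6001 reproduces the 0.08 Ah and 0.09 Ah values above.

```python
from collections import defaultdict


def work_by_tool(samples):
    """Build {tool_id: [Ah for each part, in part order]}.

    `samples` is a time-ordered list of (seconds, tool_id, part_number, amps).
    """
    ah = defaultdict(float)
    for (t0, tool0, part0, a0), (t1, tool1, part1, a1) in zip(samples, samples[1:]):
        # Only integrate between consecutive readings of the same tool on the same part.
        if (tool0, part0) == (tool1, part1):
            ah[(tool0, part0)] += 0.5 * (a0 + a1) * (t1 - t0) / 3600.0

    vectors = defaultdict(list)
    for tool, part in sorted(ah):
        vectors[tool].append(ah[(tool, part)])
    return dict(vectors)


# Hypothetical readings: tool 6001 draws 8 A for 36 s on part 1, then 9 A for 36 s on part 2.
samples = [
    (0, 6001, 1, 8.0),
    (36, 6001, 1, 8.0),
    (100, 6001, 2, 9.0),
    (136, 6001, 2, 9.0),
]

vectors = work_by_tool(samples)  # {6001: [about 0.08, about 0.09]}
```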

 

A bit more is required after this point to verify our hypothesis, which we’ll cover in the next part. This analysis template is also not limited to tool wear; it can be extended to other applications like determining the size of a grinding wheel, graphing out changes to the physical infrastructure of the machinery, etc. Analyzing amperage in a machine is like analyzing the heartbeat of a person: many insights can be realized, problems can be diagnosed, and truth can be revealed through the study of power.

Stay tuned for Part 2, where we explain how to automate the detection of potentially troubling amperage patterns.

