Am I out of my mind trying to automate this in R?
October 6, 2016 5:39 PM
I built a machine that (mostly) works, but I omitted an element from the spec. The machine's required logging works - it accurately shows what goes wrong, but a facility for inverting the logs to show errors is what I omitted. Therefore, I need a tool that will help me and others diagnose it when it fails, and the tool in my bag that has worked for me so far is R. Is this even the right bag from which to keep working?
Frankly speaking I am out of my depth. I'm not a coder but I can get the requisite information out of the logs, using a bunch of manipulations in R, plus examination I do not know how to encode. These manipulations don't exist as a single script one can run against the measurement log, and that is what I need to create.
Imagine a machine that consists of four modules: Ms, M1, M2 and M3. The numbered modules are state machines that can occupy 64 states, 48 of which are illegal. States during transition time of physical equipment are legal, for known intervals: some equipment takes several records to transition from one state to the next. Each log is a data.frame of 184 columns x 6,000 to 10,000 rows; 64 of the columns are digital outputs and another 64 are digital inputs, and together they cover all four modules. Module Ms has correct and incorrect conditions but it is not a state machine: those are adjudged using analog information and are not the problem here.
Using the digital I/O and the timestamps, I can judge easily when M1, M2, or M3 has entered an illegal state. I need to identify them, ultimately by timestamp which is column 1 of the data.frame.
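A minimal sketch of that idea in R, under loud assumptions: the toy log, the three bit columns (a real module with 64 states would need six), the set of legal state codes and the `max_transit` allowance are all made up for illustration. It packs a module's bits into one state number per record, marks records whose state is not in the legal set, and only reports runs of illegal records that last longer than the allowed transition time.

```r
# Hypothetical toy log: column 1 is the timestamp; DI1..DI3 stand in for
# the bits that encode one module's state (assumed, not from the real log).
log <- data.frame(
  timestamp = 1:8,
  DI1 = c(0, 0, 1, 1, 1, 0, 0, 0),
  DI2 = c(0, 0, 0, 1, 1, 1, 1, 0),
  DI3 = c(0, 0, 0, 0, 0, 0, 1, 0)
)

flag_illegal_states <- function(log, bit_cols, legal_states, max_transit) {
  # Pack the bit columns into one integer-valued state code per record.
  bits <- as.matrix(log[, bit_cols, drop = FALSE])
  weights <- 2^(seq_along(bit_cols) - 1)
  state <- as.vector(bits %*% weights)

  # A record is suspect when its state is not in the legal set.
  illegal <- !(state %in% legal_states)

  # Length of the run each record belongs to, so that short illegal runs
  # (equipment still in transit) can be tolerated.
  runs <- rle(illegal)
  run_len <- rep(runs$lengths, runs$lengths)

  log$state <- state
  log$error <- illegal & run_len > max_transit
  log
}

# Assumed for the example: states 0 and 3 are legal, and one record of
# transit time is acceptable.
flagged <- flag_illegal_states(log, c("DI1", "DI2", "DI3"),
                               legal_states = c(0, 3), max_transit = 1)
bad_timestamps <- flagged$timestamp[flagged$error]
print(bad_timestamps)
```

The single illegal record at timestamp 3 is treated as transit and ignored; the two-record illegal run at timestamps 6 and 7 is reported.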
To automate this and take the hand examination out of the question, should I learn to examine the parsed log files with different queries, or generate a bunch of by-product data.frame objects and process those, then throw them away - which is the hand-examination technique I am using now? The queries would need to examine DIs and DOs like this:
if (DO43 is asserted) AND (DI11 transitions from 1 to 0 within X entries) AND (DI12 transitions from 0 to 1 within X+Y entries) AND NOT DI13 then legal; else add a notation flagging an illegal state to an error column in the data.frame.
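One possible reading of that rule, sketched in R. The assumptions here are mine and may not match the real machine: the check is triggered on the record where DO43 goes from 0 to 1, the DI11 and DI12 transitions must happen strictly after that record, and DI13 must stay 0 for the whole X+Y window. The toy data and the helper names `edge_at` and `check_do43_rule` are hypothetical.

```r
# Indices i where x[i - 1] == from and x[i] == to (a transition).
edge_at <- function(x, from, to) {
  which(c(FALSE, head(x, -1) == from & tail(x, -1) == to))
}

check_do43_rule <- function(log, X, Y) {
  n <- nrow(log)
  starts    <- edge_at(log$DO43, 0, 1)  # records where DO43 becomes asserted
  di11_fall <- edge_at(log$DI11, 1, 0)
  di12_rise <- edge_at(log$DI12, 0, 1)

  log$error <- ""
  for (s in starts) {
    window <- s:min(s + X + Y, n)
    ok <- any(di11_fall > s & di11_fall <= s + X) &&
          any(di12_rise > s & di12_rise <= s + X + Y) &&
          all(log$DI13[window] == 0)
    if (!ok) log$error[s] <- "illegal: DO43 rule"
  }
  log
}

# Hypothetical toy log with two DO43 assertions; only the first one is
# followed by the expected DI11 and DI12 transitions.
log <- data.frame(
  timestamp = 1:12,
  DO43 = c(0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0),
  DI11 = c(1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1),
  DI12 = c(0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0),
  DI13 = rep(0, 12)
)

checked <- check_do43_rule(log, X = 2, Y = 2)
print(checked$timestamp[checked$error != ""])
```

With one such function per rule, each writing to the same error column, the whole set of checks can live in a single script that is run against a parsed log.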
I'm no neophyte at R, but I do not understand how to automate the process of stepping through a data.frame to perform multiple tests that span multiple records. Tests for legality need to be applied to each record in its context of nearby records and so far I don't understand how to do that.
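One common way to test each record in the context of its neighbours, without stepping through rows by hand, is to build shifted copies of a column and compare them as whole vectors. A small sketch in base R; `lag_by` and `any_within_next` are hypothetical helper names, not functions from any package.

```r
# Value of x from k records earlier, padded with NA at the start.
lag_by <- function(x, k) c(rep(NA, k), x)[seq_along(x)]

# TRUE at record i if `event` is TRUE anywhere in records i+1 .. i+k.
any_within_next <- function(event, k) {
  n <- length(event)
  vapply(seq_len(n), function(i) {
    if (i == n || k < 1) return(FALSE)
    any(event[(i + 1):min(i + k, n)])
  }, logical(1))
}

x <- c(1, 1, 0, 0, 1, 0)

# A 1 -> 0 transition is "previous value was 1 and current value is 0".
falling <- lag_by(x, 1) == 1 & x == 0

# For every record: does a 1 -> 0 transition occur in the next 2 records?
soon <- any_within_next(falling, 2)

print(which(falling))
print(soon)
```

Results like `falling` and `soon` are ordinary columns that can be added to the data.frame, so a multi-record test becomes a one-line logical expression across columns.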
If writing a decent log-decoder for this situation in R is not a reasonable task, what should I look for in a proposal to do it?
This is something I might solve with unix command line tools piped together: sort, grep, cut, uniq, and either (sed and awk) or perl, with the logic to look at current vs prev state in the perl part.
posted by zippy at 5:54 PM on October 6, 2016 [1 favorite]
Best answer: ... which is to say, if you can describe how to detect the error conditions, this is likely enough of a spec for a programmer with experience in log analysis to write tools to automate this detection.
posted by zippy at 5:57 PM on October 6, 2016
Response by poster: Thanks, zippy, I did not know log analysis was a thing and that is what I need.
posted by jet_silver at 7:28 PM on October 13, 2016
posted by demiurge at 5:49 PM on October 6, 2016