I am trying to find the sets of alarms that lead to a failure, and I am using the Apriori algorithm for it. I have a small dataset comprising 70 failed units and 750 healthy units. As a starting point I am using a standard 2 percent support and 0.75 confidence to get rules of the form LHS -> {failure},
however, no rules appear.
How should we set support, confidence and lift when we don't get the desired rules?
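To make those thresholds concrete, here is a rough arithmetic sketch in Python. It assumes each unit is one transaction and that "failure" is simply an item in the failed units' transactions; only the counts 70 and 750 and the two thresholds come from the post, the variable names are made up for illustration.

```python
import math

n_failure = 70      # failed units, from the post
n_healthy = 750     # healthy units, from the post
n_total = n_failure + n_healthy

min_support = 0.02      # thresholds quoted in the post
min_confidence = 0.75

# Rule support counts units that contain the LHS alarms AND the failure item,
# so this is the minimum number of failed units that must share the alarm set.
min_units = math.ceil(min_support * n_total)

# Baseline failure rate: the confidence a rule would have with an empty LHS.
base_rate = n_failure / n_total

# lift = confidence / P(failure), so a confidence floor implies a lift floor.
min_lift = min_confidence / base_rate

print(f"failed units needed per rule: {min_units} of {n_failure}")
print(f"baseline failure rate: {base_rate:.4f}")
print(f"implied minimum lift: {min_lift:.2f}")
```

Under these assumptions, a rule only survives if a sizeable share of the 70 failed units carry the same alarm combination and that combination is almost absent from the healthy units, which may be why nothing shows up at these settings.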
A general approach would be to treat the alarm data as sequential data and use sequential pattern mining algorithms to find rules. Fault-related alarms make up a very small percentage of the alarm database, so you may have to reduce the support value to as low as 0.01%. A good starting place could be Zaki (2004) or the cSPADE algorithm, which is available in R through the arules and arulesSequences packages. The resulting rules will be general in nature; after generating them, you can pick the rules with fault-related alarms on the right-hand side (the consequent). In this process you could also apply an additional parameter (time confidence), which is not available in cSPADE, so you would have to write a custom function. The time confidence parameter could be used to forecast the timing of the fault given {lhs}.
Thanks,
I am using Alteryx for it, and I am using association rules to find the rules for the problem statement. I have kept my support as low as possible, but if I keep both support and confidence extremely low, say 1 percent each, wouldn't the rules be too inconclusive to prove our hypothesis that a certain alarm leads to a failure?
Why can't we neglect the healthy data in this case and simply work on the failure data?
Jishnu Bhattacharya said:
A general approach would be to treat the alarm data as sequential data and use sequential pattern mining algorithms to find rules. Fault-related alarms make up a very small percentage of the alarm database, so you may have to reduce the support value to as low as 0.01%. A good starting place could be Zaki (2004) or the cSPADE algorithm, which is available in R through the arules and arulesSequences packages. The resulting rules will be general in nature; after generating them, you can pick the rules with fault-related alarms on the right-hand side (the consequent). In this process you could also apply an additional parameter (time confidence), which is not available in cSPADE, so you would have to write a custom function. The time confidence parameter could be used to forecast the timing of the fault given {lhs}.
Simple: set support low and confidence high! After this step, filter the rules down to those whose RHS refers to the fault.
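The "low support, high confidence, then keep RHS = fault" recipe can be sketched roughly as below. This is a brute-force enumeration over small alarm sets, not Apriori or cSPADE, and it is not what Alteryx or arules does internally; the toy units, the alarm codes A1..A3 and the helper name failure_rules are all hypothetical.

```python
from itertools import combinations

# Hypothetical toy data: (set of alarm codes seen on the unit, did it fail?)
units = [
    ({"A1", "A2"}, True),
    ({"A1", "A2", "A3"}, True),
    ({"A1", "A2"}, True),
    ({"A1"}, False),
    ({"A3"}, False),
    ({"A2", "A3"}, False),
    ({"A1", "A3"}, False),
    ({"A3"}, False),
    (set(), False),
    ({"A2"}, False),
]

def failure_rules(units, min_support, min_confidence, max_len=2):
    """Return (lhs, support, confidence, lift) for rules LHS -> {failure}."""
    n = len(units)
    base_rate = sum(failed for _, failed in units) / n
    alarms = sorted(set().union(*(a for a, _ in units)))
    rules = []
    for k in range(1, max_len + 1):
        for lhs in combinations(alarms, k):
            lhs_set = set(lhs)
            # Units showing every alarm in the LHS, failed or not.
            n_lhs = sum(lhs_set <= a for a, _ in units)
            # Units showing the LHS that also failed.
            n_both = sum(lhs_set <= a and failed for a, failed in units)
            if n_lhs == 0:
                continue
            support = n_both / n
            confidence = n_both / n_lhs
            if support >= min_support and confidence >= min_confidence:
                rules.append((lhs, support, confidence, confidence / base_rate))
    # Strongest association with failure first.
    return sorted(rules, key=lambda r: r[3], reverse=True)

rules = failure_rules(units, min_support=0.1, min_confidence=0.75)
for lhs, support, confidence, lift in rules:
    print(lhs, round(support, 3), round(confidence, 3), round(lift, 3))
```

Because the consequent is fixed to failure from the start, the support threshold can be lowered without flooding the output with rules about unrelated alarms; confidence and lift then do the ranking.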
Hi,
Thanks for sharing your knowledge. I am not very familiar with this topic, but I have some knowledge of it that I want to share. Like Rajat Jain, I also used Alteryx for this, and I got a solution for the problem statement. We can't neglect the healthy data.
Regards,
Riya
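On the question raised earlier in the thread about dropping the healthy units, a small sketch may help show why they are needed. The data, the alarm codes and the helper rule_stats below are hypothetical; the point is only that confidence and lift for LHS -> {failure} are computed against units that did not fail.

```python
# Hypothetical toy data: (set of alarm codes seen on the unit, did it fail?)
units = [
    ({"A1", "A3"}, True),
    ({"A3"}, True),
    ({"A1"}, True),
    ({"A3"}, False),
    ({"A3"}, False),
    ({"A3"}, False),
    ({"A2", "A3"}, False),
    ({"A2"}, False),
]

def rule_stats(units, lhs):
    """Confidence and lift of the rule lhs -> {failure} on the given units."""
    n = len(units)
    n_lhs = sum(lhs <= alarms for alarms, _ in units)
    n_both = sum(lhs <= alarms and failed for alarms, failed in units)
    n_failed = sum(failed for _, failed in units)
    confidence = n_both / n_lhs
    lift = confidence / (n_failed / n)
    return confidence, lift

failed_only = [u for u in units if u[1]]

conf_all, lift_all = rule_stats(units, {"A3"})
conf_failed, lift_failed = rule_stats(failed_only, {"A3"})

print("all units:   ", round(conf_all, 3), round(lift_all, 3))
print("failed only: ", round(conf_failed, 3), round(lift_failed, 3))
```

With only failed units, every alarm that occurs at all gets confidence 1 and lift 1 for LHS -> {failure}, so a common nuisance alarm looks just as predictive as a genuinely fault-related one. The healthy units are what allow the two to be told apart.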