Intuceo help

Glossary
The glossary contains an alphabetical list of data science terminology used by Intuceo.
Actionability: An attribute, insight, etc., is actionable if a business user can use it to make business decisions. For instance, the business user cannot change or make a decision based on the attribute Gender, whereas the attribute CampaignMode, which has direct, email, phone, and pamphlet options, is actionable because the business user can modify the options and evaluate the impact.
Attribute: The data categories found in spreadsheet column headers may be called by many names (e.g., attributes, variables, predictors, features, independents, classes, etc.). Intuceo uses the term “attribute” and classifies attributes as follows:
Categorical: These can be nominal (e.g., gender, customer ID, zip code, etc.) or ordinal (e.g., ranking by height [tall, medium, short], age [senior, middle-aged, young adult], etc.).
Numeric: For example, height, temperature, income, etc. Numeric attributes can be defined as ranges or as intervals (e.g., between dates, ranges of temperature, etc.).
Bin size: See level.
Bin: See level.
Categorical attribute: See attribute.
Class level: one of the levels of a target attribute. See level.
Classifier: one or more algorithms that map inputs (attributes) to a categorical output (target attribute). A classifier might perform this function to, for example, detect fraudulent cases; determine whether tumor cells are malignant or benign; classify transactions as legitimate or fraudulent; sort secondary structures of proteins into alpha-helix, beta-sheet, and random coil; categorize news stories as finance, entertainment, and sports; etc.
Confidence: the strength or reliability of an insight. It is defined as the percentage of data points supporting the insight out of all data points supporting the insight body. A data point supports the insight body if it contains all the items of the insight body.
Equal frequency: when the data is divided into k groups where each group contains approximately the same number of values. For both equal frequency and equal width, the best way to determine k is to look at the histogram and experiment with different intervals or groups. See also equal width.

Equal width: when the data is divided into k intervals of equal size. The width of each interval is w = (max - min) / k. The interval boundaries are min + w, min + 2w, ..., min + (k - 1)w. See also equal frequency.
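As a minimal sketch of the two binning approaches, assuming the data is a plain Python list, the following computes equal-width boundaries and equal-frequency groups. The function names `equal_width_edges` and `equal_frequency_bins` are hypothetical and are not part of Intuceo.

```python
def equal_width_edges(values, k):
    """Return the k - 1 interior boundaries min + w, ..., min + (k - 1)w."""
    lo, hi = min(values), max(values)
    w = (hi - lo) / k  # width of each interval
    return [lo + i * w for i in range(1, k)]


def equal_frequency_bins(values, k):
    """Split the sorted values into k groups of approximately equal size."""
    ordered = sorted(values)
    n = len(ordered)
    # Group i covers positions [i*n//k, (i+1)*n//k), so sizes differ by at most 1.
    return [ordered[i * n // k:(i + 1) * n // k] for i in range(k)]


ages = [18, 22, 25, 31, 38, 40, 47, 52, 60]  # illustrative data
edges = equal_width_edges(ages, 3)       # w = (60 - 18) / 3 = 14
groups = equal_frequency_bins(ages, 3)   # three groups of three values each
```

Trying a few values of k and comparing the resulting groups against the histogram is one way to follow the advice given under equal frequency.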

Error: If the estimated output differs from the actual value, the difference is known as an error. Errors are generally summarized in a confusion matrix. The following metrics are derived from a confusion matrix:
Accuracy: the percentage of correctly identified matches detected from possible matches. This metric is used for the entire dataset (i.e., correct matches include all classifier and null matches). Accuracy = no. of correct matches / total no. of matches possible.

False negative: a test result that is incorrect because the test failed to recognize an existing condition or finding.
False positive: a test result that is incorrect because the test indicated a condition or finding that does not exist.
False positive rate (FPR): how many incorrect positive results occur among all available negative samples.
Precision: the percentage of correctly identified matches detected from matches made. Precision = no. of correct matches / (no. of correct matches + no. of wrong matches).
Recall: the percentage of correctly identified matches detected from available matches. Recall = no. of correct matches / (no. of correct matches + no. of missed matches).
True negative: a test result that does not detect the condition when the condition is absent.
True positive: a test result that detects the condition when the condition is present.
True positive rate (TPR): how many correct positive results occur among all available positive samples.
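The metrics above can be sketched from the four confusion-matrix counts. This is an illustrative example only: the function name `confusion_metrics` and the sample counts are assumptions, and it presumes that every denominator is non-zero. Here "correct matches" correspond to true positives (plus true negatives for accuracy), "wrong matches" to false positives, and "missed matches" to false negatives.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall, TPR, and FPR from raw counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),  # numerically identical to the TPR
        "tpr": tp / (tp + fn),
        "fpr": fp / (fp + tn),
    }


# Hypothetical counts for a 100-record test set.
m = confusion_metrics(tp=40, fp=10, tn=45, fn=5)
print(m)
```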
Hidden insight: Intuceo uses three different methods (i.e., divide-and-conquer, separate-and-conquer, and evolutionary methods) to generate actionable insights that business users can easily understand and apply.
Hypothesis: a hunch, rule of thumb, educated guess, best practice, or insight intended to solve a business problem or positively influence the behavior of a target attribute. Any hypothesis must be validated (i.e., proven right or wrong) before it can be safely and reliably used as a basis for business decisions.
Imputation: the process of replacing missing data with substituted values. The substituted values are based on statistical methods. Intuceo uses random forest imputation and central imputation methods to fill the missing data.
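As a rough illustration of central imputation, the sketch below assumes that it means replacing each missing value with a central statistic of its column: the median for numeric attributes and the most frequent level for categorical ones. The function name `central_impute` and the use of `None` to mark missing values are assumptions, and this is not Intuceo's implementation. Random forest imputation is not shown.

```python
from statistics import median, mode


def central_impute(column):
    """Fill None entries with the column median (numeric) or mode (categorical)."""
    present = [v for v in column if v is not None]
    numeric = all(isinstance(v, (int, float)) for v in present)
    # Assumes at least one value is present in the column.
    fill = median(present) if numeric else mode(present)
    return [fill if v is None else v for v in column]


incomes = central_impute([30000, None, 50000, 42000])
channels = central_impute(["email", "phone", None, "email"])
```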
Insight body: The antecedents of an insight. For example, in “If gender = male and income > $10,000 then class = buyer,” the insight body is “If gender = male and income > $10,000.” See insight head.
Insight group: Insight groups help you to distinguish sets of insights that do not have a direct or indirect relationship. Insights that share the same insight head or one of the conditions in the insight body can form an insight group.
Insight head: The consequent of an insight. For example, in “If gender = male and income > $10,000 then class = buyer,” the insight head is “then class = buyer.” See insight body.
Level: Discrete values of attributes are called levels. If the numeric data is discretized into different levels, the levels are known as bins. Thus, both levels and bins represent discrete values of attributes. For example, if age is discretized into three bins (e.g., < 20, 20-40, > 40), the number of records in each bin is referred to as the bin size.
Lift: the factor by which the confidence exceeds the expected confidence. It is determined by dividing the confidence of the insight by the support of the insight head.
No. of records: number of records that satisfy an insight.
Non-triviality: how explicable an insight is. The number of conditions in the antecedent of an insight is inversely related to its explicability. The more conditions in the antecedent, the less explicable the insight is, and hence the lower its non-triviality score, and vice versa. Non-triviality = round(1.1765 * exp(-0.163 * AttributeCount), 2).
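The non-triviality formula can be evaluated directly. In this short sketch, the function name `non_triviality` is hypothetical and `attribute_count` stands for AttributeCount, the number of conditions in the antecedent.

```python
import math


def non_triviality(attribute_count):
    """Apply the glossary formula, rounded to two decimal places."""
    return round(1.1765 * math.exp(-0.163 * attribute_count), 2)


# The score falls as the antecedent gains conditions.
scores = {n: non_triviality(n) for n in range(1, 6)}
print(scores)
```

With these constants, an insight with a single condition scores 1.0, and each additional condition lowers the score.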
Numeric attribute: See attribute.
Quick insight: Intuceo uses an information-theory approach to quickly scan the data and generate a few useful insights for each class level.
ROC space: defined by the FPR and the TPR as the x-axis and the y-axis, respectively, it depicts the relative trade-offs between true positives (benefits) and false positives (costs). The prediction results of each model are represented as a point in the ROC space. The best possible prediction method would yield a point in the upper left corner (coordinate (0, 1), also called a perfect classification). A completely random guess would give a point along the diagonal line (the so-called line of no discrimination) from the bottom left to the top right corner. Points above the diagonal line represent good classification results (better than random), while points below the line represent poor results (worse than random).
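To make the geometry concrete, the following sketch places a model in ROC space and compares it with the line of no discrimination. The function names `roc_point` and `compare_to_random` and the counts are illustrative assumptions.

```python
def roc_point(tp, fp, tn, fn):
    """Return (FPR, TPR), the model's coordinates in ROC space."""
    return fp / (fp + tn), tp / (tp + fn)


def compare_to_random(point):
    """Classify a point relative to the diagonal TPR = FPR."""
    fpr, tpr = point
    if tpr > fpr:
        return "better than random"
    if tpr < fpr:
        return "worse than random"
    return "no discrimination"


perfect = roc_point(tp=50, fp=0, tn=50, fn=0)   # upper left corner
weak = roc_point(tp=10, fp=40, tn=10, fn=40)    # below the diagonal
print(perfect, compare_to_random(perfect))
print(weak, compare_to_random(weak))
```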

Score: the harmonic mean of support, lift, and confidence calculated by the following formula: 3 * support * confidence * lift / (support * confidence + confidence * lift + support * lift).
Support: the ratio of data points supporting an insight to the total number of data points in the database.
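The definitions of support, confidence, lift, and score can be worked through on a small dataset. This is a sketch under stated assumptions: the records and the function name `insight_metrics` are invented for illustration, the insight body and head are passed as predicates, and confidence is returned as a fraction rather than a percentage.

```python
def insight_metrics(records, body, head):
    """Return (support, confidence, lift, score) for the insight body -> head."""
    n = len(records)
    body_hits = [r for r in records if body(r)]
    both = sum(1 for r in body_hits if head(r))  # records supporting the insight
    support = both / n
    confidence = both / len(body_hits)
    head_support = sum(1 for r in records if head(r)) / n
    lift = confidence / head_support
    # Harmonic mean of support, confidence, and lift.
    score = 3 * support * confidence * lift / (
        support * confidence + confidence * lift + support * lift
    )
    return support, confidence, lift, score


# Hypothetical data: 10 records, 5 of them male, 5 of them buyers.
records = (
    [{"gender": "male", "cls": "buyer"}] * 4
    + [{"gender": "male", "cls": "other"}]
    + [{"gender": "female", "cls": "buyer"}]
    + [{"gender": "female", "cls": "other"}] * 4
)
support, confidence, lift, score = insight_metrics(
    records,
    body=lambda r: r["gender"] == "male",
    head=lambda r: r["cls"] == "buyer",
)
```

For the insight "If gender = male then class = buyer", 4 of 10 records support the insight and 4 of the 5 records matching the body support it, so support is 0.4, confidence is 0.8, and lift is 0.8 / 0.5 = 1.6.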
Target attribute: The value of certain attributes may change when other attributes are modified. Such a changeable attribute may be called by many names (e.g., target attribute, target class, dependent variable, class variable, response variable, etc.). In Intuceo, when the objective is to adjust and/or experiment with a changeable attribute in order to achieve an actionable business outcome, that attribute is designated as the “target attribute” of the analysis.
Test data: a subset of data, generally 20% to 30% of the entire dataset, used to determine the accuracy of the model. See training data.
Training data: a subset of data, generally 70% to 80% of the entire dataset, used to build a model. See test data.
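A simple way to produce such subsets is a shuffled split. The sketch below is illustrative only: the function name `split_train_test`, the fixed seed, and the 25% test fraction are assumptions, and Intuceo's own partitioning method may differ.

```python
import random


def split_train_test(records, test_fraction=0.2, seed=0):
    """Shuffle a copy of the records and hold out a fraction as test data."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


train_rows, test_rows = split_train_test(range(100), test_fraction=0.25)
print(len(train_rows), len(test_rows))
```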
EC2: Elastic Compute Cloud
FTP: File Transfer Protocol

Account Access
To use Intuceo, you must register with a user ID and password. The user ID is the email ID with which the user is registered, and once created it cannot be changed. Access is provided after a successful login to Intuceo with the registered user ID and password. Sharing user credentials is not recommended.
Intuceo password policy:
· The password should be at least 8 characters long and contain at least 1 uppercase letter, 1 lowercase letter, 1 numeral, and 1 special character.
· The user has five attempts to log in to Intuceo in case an incorrect password is entered by mistake. After that, the account is locked, and the user needs to contact the support team or an administrator to unlock it.
· All passwords need to be changed within 90 days (or a configurable number of days) of the last password change. Intuceo displays a message prompting the user to change the password starting 30 days before the due date.
· When a password is changed, Intuceo does not accept the new password if it is the same as either of the last 2 passwords.
· Intuceo provides the following facilities to the user:
Change password – The user can change the current password while logged in to Intuceo. Click Change password and provide the old and new passwords.
Forgot password – When this link is clicked, an email to reset the password is sent to the user’s registered email ID. The user needs to click the link in the email and provide a new password.
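The complexity and reuse rules in the password policy above can be sketched as a simple check. This is an illustration, not Intuceo's implementation: the function name `password_problems` is hypothetical, "special character" is assumed to mean any character in Python's `string.punctuation`, and previous passwords are compared as plain text purely for readability (a real system would compare hashes).

```python
import string


def password_problems(new_password, previous_passwords=()):
    """Return a list of policy violations; an empty list means acceptable."""
    problems = []
    if len(new_password) < 8:
        problems.append("shorter than 8 characters")
    if not any(c.isupper() for c in new_password):
        problems.append("no uppercase letter")
    if not any(c.islower() for c in new_password):
        problems.append("no lowercase letter")
    if not any(c.isdigit() for c in new_password):
        problems.append("no numeral")
    if not any(c in string.punctuation for c in new_password):
        problems.append("no special character")
    # Only the two most recent passwords are blocked from reuse.
    if new_password in previous_passwords[-2:]:
        problems.append("matches one of the last 2 passwords")
    return problems


print(password_problems("Abcdef1!"))   # acceptable password
print(password_problems("abc"))        # several violations
```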