This function implements the Isolation Forest algorithm for anomaly detection using the 'ranger' package.

Usage

isoForest(
  data,
  num_trees = 500,
  sample_size = min(nrow(data), 256L),
  max_depth = ceiling(log2(sample_size)),
  mtry = NULL,
  num.threads = NULL,
  seed = NULL,
  ...
)

Arguments

data

A data frame or matrix containing the data to be analyzed.

num_trees

The number of trees to be grown in the forest. Default is 500.

sample_size

The number of observations sampled to grow each tree. Default is the smaller of 256 and the number of rows in the data.

max_depth

The maximum depth of each tree. Default is the ceiling of the logarithm base 2 of the sample size.

mtry

The number of candidate variables considered at each split. Default is NULL, in which case the square root of the number of variables in the data is used.

num.threads

The number of threads to use for parallel processing. Default is NULL, which means that all available threads are used.

seed

The seed for random number generation. Default is NULL, which means that the current time is used as the seed.

...

Additional arguments to be passed to the ranger function.
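
To make the defaults of sample_size and max_depth concrete, the following base R sketch evaluates the default expressions from the Usage signature by hand for the iris data set. It does not call isoForest itself; the variable names simply mirror the argument names.

```r
# Evaluate the documented default expressions by hand for the iris data set.
data <- iris

# sample_size: at most 256 observations are drawn per tree
sample_size <- min(nrow(data), 256L)

# max_depth: ceiling of the base-2 logarithm of sample_size
max_depth <- ceiling(log2(sample_size))

print(sample_size)  # 150, because iris has 150 rows
print(max_depth)    # 8, because log2(150) is about 7.23
```

With fewer than 256 rows every tree is grown on the full data set, so the depth limit is governed by the number of rows rather than by the cap of 256.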

Value

A list containing the anomaly score of each data point. Each score is derived from the data point's path length, that is, the number of splits between the root of a tree and the node that isolates the point, averaged over all trees in the forest.
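
For orientation, the sketch below shows how an average path length is commonly turned into a score between 0 and 1 in the original Isolation Forest formulation. This page does not state whether isoForest applies this exact normalization, so treat it as an illustration of the general technique, not of the package internals. The helper names c_factor and anomaly_score are hypothetical and are not part of the package.

```r
# Normalizing constant c(n): the average path length of an unsuccessful
# search in a binary search tree built from n points.
# (Hypothetical helper, not part of isoForest.)
c_factor <- function(n) {
  if (n <= 1) return(0)
  if (n == 2) return(1)
  harmonic <- log(n - 1) + 0.5772156649  # approximation of H(n - 1)
  2 * harmonic - 2 * (n - 1) / n
}

# Map a mean path length to a score; shorter paths give scores closer to 1.
# (Hypothetical helper, not part of isoForest.)
anomaly_score <- function(mean_path_length, n) {
  2^(-mean_path_length / c_factor(n))
}

n <- 256
short <- anomaly_score(2, n)               # isolated after about 2 splits
typical <- anomaly_score(c_factor(n), n)   # path length equal to c(n)

print(short > typical)  # TRUE: the quickly isolated point looks more anomalous
print(typical)          # 0.5
```

Under this convention, scores well above 0.5 indicate points that are isolated unusually quickly, while scores at or below 0.5 indicate ordinary points.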

Examples

# Load the required libraries
library(ranger)
library(isoForest)
# Load the data
data <- iris
# Fit the Isolation Forest model with 'ranger' and compute the anomaly scores
result <- isoForest(data)