Plot anomaly projection using dimensionality reduction
Source: R/anomaly_plot.R
plot_anomaly_projection.Rd
Create a 2D visualization of anomalies using PCA or UMAP dimensionality reduction. Anomalies are highlighted in red, normal points in blue. For large datasets (>3000 points), automatic sampling is applied to improve speed.
Usage
plot_anomaly_projection(
model,
data,
method = "mad",
dim_reduction = c("pca", "umap"),
contamination = 0.05,
point_size = 2,
point_alpha = 0.6,
umap_n_neighbors = 15,
umap_min_dist = 0.1,
sample_rate = 0.05
)
Arguments
- model
An isoForest model object
- data
The original data (must be numeric)
- method
Threshold method for anomaly detection (default: "mad")
- dim_reduction
Dimensionality reduction method: "pca" or "umap" (default: "pca")
- contamination
Contamination rate (default: 0.05; only used if method = "contamination")
- point_size
Size of points in the plot (default: 2)
- point_alpha
Transparency of points (default: 0.6)
- umap_n_neighbors
Number of neighbors for UMAP (default: 15)
- umap_min_dist
Minimum distance for UMAP (default: 0.1)
- sample_rate
Target anomaly rate in the displayed data (default: 0.05). The function will sample normal points so that anomalies represent this proportion of total displayed points. Set to NULL to disable sampling. For example, if sample_rate = 0.05 and there are 100 anomalies, the total displayed points will be approximately 2000 (100/0.05).
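The arithmetic behind sample_rate can be sketched as follows. This is a minimal illustration, assuming that all anomalies are kept and only normal points are down-sampled; displayed_counts is a hypothetical helper that is not part of the package, and the package's own rounding may differ.

```r
# Hypothetical helper (not part of the package): estimate how many points
# are displayed for a given sample_rate, assuming every anomaly is kept
# and only normal points are sampled.
displayed_counts <- function(n_anomalies, n_normal, sample_rate = 0.05) {
  # sample_rate = NULL disables sampling, so everything is shown
  if (is.null(sample_rate)) {
    return(c(anomalies = n_anomalies,
             normal = n_normal,
             total = n_anomalies + n_normal))
  }
  # Total number of points so that anomalies make up sample_rate of the display
  target_total <- round(n_anomalies / sample_rate)
  # Cannot show more normal points than exist, nor a negative number
  n_normal_shown <- min(n_normal, max(0, target_total - n_anomalies))
  c(anomalies = n_anomalies,
    normal = n_normal_shown,
    total = n_anomalies + n_normal_shown)
}

# 100 anomalies among 5000 points, default rate: 100 / 0.05 = 2000 displayed
displayed_counts(100, 4900, sample_rate = 0.05)

# A higher rate shows fewer points: 100 / 0.10 = 1000 displayed
displayed_counts(100, 4900, sample_rate = 0.10)
```

Under these assumptions, raising sample_rate reduces the number of normal points drawn, while the anomalies themselves are never dropped.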
Examples
if (FALSE) { # \dontrun{
# Using PCA for dimensionality reduction
model <- isoForest(iris[1:4])
plot_pca <- plot_anomaly_projection(model, iris[1:4], dim_reduction = "pca")
print(plot_pca)
# Using UMAP for dimensionality reduction
plot_umap <- plot_anomaly_projection(model, iris[1:4], dim_reduction = "umap")
print(plot_umap)
# For large datasets, automatic sampling is applied
# large_data <- matrix(rnorm(5000 * 5), ncol = 5)
# model <- isoForest(large_data)
# plot_anomaly_projection(model, large_data) # Samples so anomalies = 5% of display
# Custom sample rate: show fewer points (anomalies = 10% of display)
# plot_anomaly_projection(model, large_data, sample_rate = 0.10)
} # }
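As a rough, package-independent illustration of what a 2D anomaly projection involves, the sketch below uses only base R. It is not the package's implementation: the Mahalanobis distance stands in for the isolation forest score, and the cutoff of median + 3 * MAD is an assumed rule chosen for illustration.

```r
# Numeric input, as required by plot_anomaly_projection()
x <- as.matrix(iris[1:4])

# Stand-in anomaly score (NOT the isoForest score): squared Mahalanobis distance
scores <- mahalanobis(x, center = colMeans(x), cov = cov(x))

# Assumed MAD-style rule: flag scores far above the median
threshold <- median(scores) + 3 * mad(scores)
is_anomaly <- scores > threshold

# Project the data onto the first two principal components
pca <- prcomp(x, center = TRUE, scale. = TRUE)
proj <- pca$x[, 1:2]

# Anomalies in red, normal points in blue
plot(proj,
     col = ifelse(is_anomaly, "red", "blue"),
     pch = 19, cex = 0.8,
     xlab = "PC1", ylab = "PC2",
     main = "PCA projection with flagged points")
```

Swapping prcomp for a UMAP embedding would change only the projection step; the scoring and thresholding steps stay the same.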