Package 'nmfbin' reference manual

Title:	Non-Negative Matrix Factorization for Binary Data
Description:	Factorize binary matrices into rank-k components using the logistic function in the updating process. See e.g. Tomé et al. (2015) <doi:10.1007/s11045-013-0240-9>.
Authors:	Michal Ovadek [aut, cre, cph]
Maintainer:	Michal Ovadek <[email protected]>
License:	MIT + file LICENSE
Version:	0.2.1
Built:	2025-02-02 03:40:07 UTC
Source:	https://github.com/michalovadek/nmfbin

Logistic Non-negative Matrix Factorization

Description

Performs logistic non-negative matrix factorization (NMF) on a binary matrix, returning the non-negative factors W and H together with a constant threshold c.

Usage

nmfbin(
  X,
  k,
  optimizer = "mur",
  init = "nndsvd",
  max_iter = 1000,
  tol = 1e-06,
  learning_rate = 0.001,
  verbose = FALSE,
  loss_fun = "logloss",
  loss_normalize = TRUE,
  epsilon = 1e-10
)

Arguments

`X`	A binary matrix (m x n) to be factorized.
`k`	The number of factors (components, topics).
`optimizer`	Type of updating algorithm. `mur` for NMF multiplicative update rules, `gradient` for gradient descent, `sgd` for stochastic gradient descent.
`init`	Method for initializing the factorization. The default, `nndsvd`, is Nonnegative Double Singular Value Decomposition with average densification.
`max_iter`	Maximum number of iterations for optimization.
`tol`	Convergence tolerance. The optimization stops when the change in loss is less than this value.
`learning_rate`	Learning rate (step size) for the gradient descent optimization.
`verbose`	Print convergence if `TRUE`.
`loss_fun`	Choice of loss function: `logloss` (negative log-likelihood, also known as binary cross-entropy) or `mse` (mean squared error).
`loss_normalize`	Normalize loss by matrix dimensions if `TRUE`.
`epsilon`	Constant to avoid log(0).
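
The roles of `loss_fun`, `loss_normalize` and `epsilon` can be illustrated with a small base R sketch. This is not the package's internal code: the helper names `log_loss` and `mse_loss` are hypothetical, and clipping probabilities to `[epsilon, 1 - epsilon]` is only one common way to avoid log(0), assumed here for illustration.

```r
# Illustrative helpers (hypothetical names, not part of nmfbin).
# X is a binary matrix, P a matrix of predicted probabilities.
log_loss <- function(X, P, epsilon = 1e-10, normalize = TRUE) {
  # Assumption: clip P away from 0 and 1 so that log() stays finite
  P <- pmin(pmax(P, epsilon), 1 - epsilon)
  loss <- -sum(X * log(P) + (1 - X) * log(1 - P))
  # Normalizing divides by the number of matrix entries (m * n)
  if (normalize) loss / length(X) else loss
}

mse_loss <- function(X, P, normalize = TRUE) {
  loss <- sum((X - P)^2)
  if (normalize) loss / length(X) else loss
}

X <- matrix(c(1, 0, 1, 0), 2, 2)
P <- matrix(c(0.9, 0.1, 0.8, 0.0), 2, 2)

# Without clipping, the entry with P = 0 yields 0 * log(0) = NaN
unclipped <- -sum(X * log(P) + (1 - X) * log(1 - P))

clipped <- log_loss(X, P)
squared <- mse_loss(X, P)
```

Here `unclipped` is `NaN` while `clipped` is finite, which is the situation the `epsilon` argument is meant to guard against.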

Value

A list containing:

W: The basis matrix (m x k). The document-topic matrix in topic modelling.
H: The coefficient matrix (k x n). Contribution of features to factors (topics).
c: The global threshold. A constant.
convergence: The loss (divergence from X) at every iteration until `tol` or `max_iter` is reached.
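
As a rough sketch of how the returned elements fit together, the code below rebuilds a matrix of probabilities from stand-in values of W, H and c. The form of the link, a logistic function applied to `W %*% H` plus the constant, is an assumption made for illustration and is not confirmed by this manual; the matrices are random placeholders, not output of `nmfbin()`.

```r
set.seed(1)
m <- 6; n <- 4; k <- 2

# Placeholder factors with the documented shapes (hypothetical values)
W <- matrix(runif(m * k), m, k)   # basis matrix, m x k
H <- matrix(runif(k * n), k, n)   # coefficient matrix, k x n
c0 <- -0.5                        # stands in for the constant `c`

# Assumed reconstruction: logistic function of W %*% H shifted by the constant
P <- 1 / (1 + exp(-(W %*% H + c0)))

# Threshold the probabilities at 0.5 to get a binary approximation of X
X_hat <- (P > 0.5) * 1
```

With a fitted model, `W`, `H` and `c0` would be replaced by `result$W`, `result$H` and `result$c`.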

Examples

## Not run: 
# Generate a binary matrix
m <- 100
n <- 50
X <- matrix(sample(c(0, 1), m * n, replace = TRUE), m, n)

# Set the number of factors
k <- 4

# Factorize the matrix with default settings
result <- nmfbin(X, k)

## End(Not run)
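
A variation on the example above, using only arguments and return elements documented on this page, might look as follows. The particular values chosen for `learning_rate` and `max_iter` are arbitrary illustrations rather than recommendations, and the call is guarded so that the snippet still runs when the package is not installed.

```r
set.seed(42)
X <- matrix(sample(c(0, 1), 100 * 50, replace = TRUE), 100, 50)

result <- NULL
if (requireNamespace("nmfbin", quietly = TRUE)) {
  # Gradient descent instead of the default multiplicative update rules
  result <- nmfbin::nmfbin(X, k = 4, optimizer = "gradient",
                           learning_rate = 0.01, max_iter = 200)
  # Last recorded loss value from the convergence trace
  print(tail(result$convergence, 1))
}
```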
