Data Science Dictionary 2022
A/B testing Abatement Costs Absolute Difference Absolute Error Absorbing Markov Chains Abundance Matrices Accelerated Failure Time Model Accelerated Life Testing Acceptable Risk Acceptance Region Acceptance Sampling Accident Proneness Additive Effect Additive Model Additive Outlier Adjacency Matrix Adjusted treatment means ACORN Acquiescence bias Actuarial statistics Adaptive Cluster Sampling Accuracy Score Activation Function Activation Classification Adam Optimization Age-related reference ranges Alpha Akaike information criterion Anaconda Analytical Categories Anomaly Detection Annuity Rate Arcsine transformation Arithmetic Growth Apache Spark API Area Under the Curve ARIMA ARMA Artificial Intelligence Artificial Neural Networks Association Assumptions Asymmetrical Distribution Asymmetric maximum likelihood (AML) Asymmetric proximity matrices Attribute Audit Trail Augmented Reality Autocorrelation Autoregression Average Precision Average Sample Number Backcasting Backpropagation Backfitting Bagging Bagplot Bayes’ Theorem Bayesian Network Bernoulli Distribution Bernoulli Trial Bias Bias-Variance Tradeoff Big Data Binomial Distribution Blending Block Clustering Blocking BLUE BMI Boosting Bootstrap Boundary Estimation Box and Whisker plot Box Counting Method Box-Cox Transformation Box-Muller Transformation Breslow-Day Test Breusch-Pagan test Brier Score Broadband Smoothing Brushing Scatter plots Business Analyst Business Intelligence C C++ Calibration Calendarization Canonical Correlation Analysis Capture-Recapture Sampling CAR CART Cartogram Cascaded parameters Case-cohort Study Case-Control Study Catastrophe theory Categorical distribution Categorical Variable Causality Ceiling effect Censored regression models Census Centering Centile Centile Reference Charts Central Limit theorem Central range Chain-binomial models Change Point Problems Chaos Chemometrics Chi-Square Test Chi-squared test for trend Choi-Williams distribution
Choleski decomposition Chow test Circadian variation Circular data Circular Distribution Circular Random Variable Classification CART Classification Matrix Class intervals Cluster Analysis Clustered Data Clustering Coefficient Sign Prediction Methods Coincidences Comma Separated Values Common Factor Variance Complete Estimator Component Bar Chart Composite Hypothesis Computational Complexity Computer Virus Computer Vision Concentration Matrix Conditional Probability Confidence Interval Confusion Matrix Continuous Variable Convex Hull Convolution Convolutional Neural Network Correlation Cost Function Covariance Credit Scoring Critical Region Cross Validation Dask Data Analysis Data Analyst Database DBMS Data Consumer Data Engineer Data Engineering (DE) Data Enrichment Dataframe Data Governance Data Journalism Data Lake Data Literacy Data Mining Data Modeling Data Pipeline Data Science Data Scientist Dataset Data Structure Data Visualization Data Warehouse Data Wrangling Decision Tree Deep Learning (DL) Dendrogram Density Based Clustering Design-Based Inference Determinism Diagnostic Key Dickey-Fuller Test Dimensionality Reduction DIP Test Directional Data Disclosure Risk Discrete Fourier Transform Discrete Time Fourier Transform Discrete Uniform Distribution Discriminant Analysis Dispersion Docker Dplyr Dropout Dual System Estimates Dynamic Allocation Indices Dynamic Graphics Dynamic Panel Data Model Dynamic Population Modeling Eberhardt’s Statistic Ecological Fallacy Econometrics EDA Effect Effect Sparsity Efficiency Eigenvector ELT EM algorithm Ensemble Methods Empirical Empirical Distribution Function Empirical Likelihood Empirical Logits Endpoint Entropy Environmental Statistics Epstein Test Erlang Distribution Error Rate Estimating Functions Estimation ETL Evaluation Metrics Excel Exploratory Data Analysis Extrapolation Factor Factor Analysis Fair Game False Positive False Negative F-distribution Feature Feature Engineering Feature Selection Fast Fourier
Transform F-score Fibonacci Distribution Finite State Machine Fitted Value Five-number Summary Flowchart Folium Forecast Fourier Series Fractal Gabor Regression Game Theory Gap Time GAUSS Gaussian Process Gauss-Markov Theorem Gauss-Newton Method Gene Frequency Estimation Gene Mapping Generalized Additive Models Generalized Linear Mixed Models Generalized Method of Moments Generalized Multinomial Distribution Generalized p-values Genetic Algorithms Genomics Genomic Ranges Genotype Geographical Correlations GAN Geostatistics GIS Ggmap ggplot2 Google Colab Goldfeld-Quandt Test GPU Gradient Descent Growth Curve Analysis Hadoop Half-mode Half-normal Plot Hanging Rootogram Hankel Matrix Halo Effect Harmonic Mean Harris and Stevens Forecasting Hash Table Hat Matrix Helmert Contrast Heuristic Computer Program Heywood Cases Heterogeneous Heteroscedasticity Hidden Markov Models Hidden Time Effects High Breakdown Methods High Dimensional Data High Throughput Data Hierarchical Likelihood Hierarchical Models Histogram Hit Rate Homogeneous Hosmer-Lemeshow test Hot deck Household Interview Surveys Human Capital Model Hurdle Model Huynh-Feldt correction Hyperbolic distributions Hyperparameter Hyperparameter Tuning Hyperplane Hypothesis Hypothesis Testing Idempotent Matrix Identity Matrix Ignorability Immigration-emigration models Imperfect Detectability Imprecise Probabilities Improper Prior Distribution Imputation Incidence Incidental Parameter Problem Inclusion and Exclusion Criteria Incubation Period Independence Independent component analysis (ICA) Index number Index plot Indicator variable Indirect least squares Indirect standardization Individual differences scaling Infectious period Inference Influence statistics Influence Influential observation Information theory Informative censoring Informative prior Infrastructure as a Service (IaaS) Institutional Surveys Intention-to-treat analysis Interaction Intercept Intercropping experiments Interim analyses Interior analysis
Interpolation Interpretability Interquartile range Interrupted time series design Interruptible designs Interval-censored observations Intervention Analysis In Time Series Interviewer bias Intraclass contingency table Intrinsic error Invariance Inverse Bernoulli sampling Irreducible chain Isobologram Isotonic regression Item non-response Item-response theory Item-total correlation Iterated bootstrap Iterated conditional modes algorithm (ICM) Iteration Iteratively reweighted least squares (IRLS) Iterative proportional fitting Jaccard coefficient Jackknife James-Stein estimators Jeffreys’s prior Jelinski-Moranda model Jittered sampling Jittering Joint distribution Jolly-Seber model Jonckheere’s k-sample test Jonckheere-Terpstra test J-shaped distribution Jupyter Notebook k1,k2 -design Kaggle Kaiser’s rule Kalman filter Kappa coefficient Karber method Khinchin theorem Kleiner-Hartigan trees Klotz test K-Means K-means inverse regression Knowledge discovery in data bases KDD K-Nearest Neighbors (KNN) Kriging Kruskal-Wallis test Kubernetes (k8s) Kurtosis Lagging indicators Lagrange multipliers Lancaster Models Landmark analysis Landmark-based shape analysis Laplace approximation Large Sample Methods Lasso Latent class analysis Latent class identifiability display Latent period Latent root distributions Latent Variable Latin square Lattice designs Lattice distribution Lattice Path Law of large numbers Law of likelihood LDU test Lead Time Bias Leaps-and-bounds algorithm Least absolute deviation regression Least significant difference test Least squares cross-validation Least squares estimation Length-biased data Length-biased sampling Lenth’s method Levene test Leverage points Lévy process Lexian distributions Lexicostatistics Lexis diagram Life table analysis Likelihood Likelihood distance test Likelihood Principle Likert scales Lilliefors test Lindley’s paradox Linear Algebra Linear-by-linear association test Linear estimator Linearizing Linear model Linear Regression
Linear transformation Line-intersect sampling Linkage analysis Linked Micromap Plot Local dependence function Locally weighted regression Logarithmic transformation Log Loss Logistic Regression Log-linear models Logrank test Lomb periodogram Long short-term memory (LSTM) Longitudinal Data Long memory processes Long-range dependence L-statistic Lyapunov exponent Lynden-Bell method Machine Learning Mahout Main Effect Mainframes Majority rule Manhattan distance Manifest variable Mann-Whitney test MANOVA Mantel-Haenszel estimator Many-outlier detection procedures Mardia’s multivariate normality test Maple MapReduce Marginal distribution Marginal homogeneity Marginal structural model (MSM) Market Basket Analysis Markov chain Masking Matched pairs Matching Matching coefficient Matching distribution Mathisen’s test Mathematica Matlab Matplotlib Mauchly test Maximum a posteriori estimate (MAP) Maximum Likelihood Estimation McCabe-Tremayne test Mean Mean and dispersion additive model (MADAM) Mean Absolute Error (MAE) Mean Deviation Mean-range plot Mean Squared Error (MSE) Mean square ratio Mean squares Mean vector Measurement error Measures Of Association Median Median effective dose Median survival time Medical audit MEDLINE Mega-trial Meta-analysis Meta regression Method of moments Meiotic mapping Mendelian randomization Microarrays Microdata Mid P-value Mid-range Midvariance Minimax rule Minimum aberration criterion Minimum average variance estimation (MAVE) method Minimum chi-squared estimator Minimum distance probability (MDP) Minimum volume ellipsoid MINITAB Mirror-match bootstrapping Misclassification error Mis-interpretation of P-values Missing values Misspecification Mixed-effects logistic regression Mixture transition distribution model MIS ML as a Service (MLaaS) MLOps Mode Model Model-based inference Model building Model Drift Model Evaluation Model Monitoring Mojena’s test Monotonic regression Monotonic sequence Monte Carlo maximum likelihood (MCML) Monte Carlo 
methods Monty Hall Problem Mood’s test Morbidity Mosaic displays Moving-average Mplus Multicentre study Multicollinearity Multidimensional scaling (MDS) Multiepisode models Multilevel models Multimodal distribution Multinomial coefficient Multinomial distribution Multinomial logistic regression Multiple-frame surveys Multiple imputation Multiple time response data Multiple time series Multistate models Multitaper spectral estimators Multivariate Analysis Multivariate Bartlett test Multivariate counting process Multivariate data Multivariate hypergeometric distribution Multivariate Modeling Multivariate Regression Multivariate ZIP model (MZIP) Mutually exclusive events Nadaraya-Watson estimator Naive Bayes Naor’s distribution Natural Language Processing (NLP) Nested case-control study Nested design Network Network sampling Newman-Keuls test Newton–Raphson method N of 1 clinical trial No free lunch theorem Noise Nominal significance level Nominal Variable Nomogram Noncentral distributions Nondifferential measurement error Non-Gaussian time series Non-informative censoring Non-informative prior distribution Nonlinear mapping (NLM) Nonlinear model Non-metric scaling Non-negative garrotte Non-parametric Bayesian models Non-parametric maximum likelihood (NPML) Non-parametric analysis of covariance Non-orthogonal designs Non-randomized clinical trial Non-response Normal approximation Normal Distribution Normality Normal scores NORMIX nQuery advisor Nuisance parameter NoSQL Null distribution Null Hypothesis Null matrix Null vector Number needed to treat Number numbness Nyquist frequency Numerical integration Numpy Oblique factors O’Brien-Fleming method O’Brien’s two-sample tests Observational study Obuchowski and Rockette method Occam’s razor Occam’s window Occupancy problems Odds Offset Ogive One Hot Encoding One-sided test One Shot Learning Open label study Open Source Operational research Opinion survey Optimization methods Oracle property Ordered alternative hypothesis
Ordinal Variable Ordination Orthant probability Orthogonal Orthogonal matrix Outcome Outlier Overfitting Pandas Parameter Pattern Recognition Pearson correlation coefficient Platform as a Service (PaaS) Plotly Polynomial Regression Precision Principal Component Analysis (PCA) Probability Probability space PySpark Python PyTorch Quartile R Random Forest Random variable Random walk Recall Regression Regression analysis Regularization Reinforcement Learning (RL) Relational Database Resampling Residual Root Mean Squared Error (RMSE) Rotational invariance Sampling Error Scala SciPy Seaborn Shiny Skewness Sklearn Software as a Service (SaaS) spaCy SQL Standard Deviation Stochastic Supervised Learning SVM Synthetic Data Target Variable TensorFlow Test Set Time Series Training Set Tree diagrams True Negative (TN) True Positive (TP) Underfitting Univariate Modeling Unstructured Data Unsupervised Learning Variance Web Scraping XGBoost Z-Score Z-Test
January 24, 2023 | abdoiiii