Feature reduction methods such as PCA reduce the number of features, after which we could scale only the variables that remain relevant. Is there a rule that normalization/scaling must be done first, followed by feature reduction?
Should we always first perform feature normalization and then the feature reduction?
Asked by Sharat Ainapur (758 views)
I would suggest first normalizing/scaling your feature data and then performing feature reduction (or feature selection). This is because most of these techniques require a meaningful representation of your data. Once the data is normalized, your features have the same order of magnitude and scatter, which makes it easier to find which of them are more relevant.
For example, PCA looks for the axes of a new projection of your data along which the variance is largest, so the computation is driven by the standard deviation (SD) of your features. If you do not normalize your data, features with a high SD will carry more weight than features with a small SD, which distorts their relevance when computing the PCA.
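To make the effect concrete, here is a minimal sketch using only NumPy. The data is synthetic and the helper names (`first_pc`, `standardize`) are my own, not from any library: two correlated features carry similar information, but one is recorded on a scale about 1000 times larger. PCA is computed by hand from the covariance matrix so that each step is visible.

```python
import numpy as np


def first_pc(X):
    """Return the unit vector of the first principal component (rows = samples)."""
    Xc = X - X.mean(axis=0)                 # center each feature
    cov = np.cov(Xc, rowvar=False)          # feature-by-feature covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    return eigvecs[:, -1]                   # direction of largest variance


def standardize(X):
    """Z-score scaling: zero mean and unit SD for every feature."""
    return (X - X.mean(axis=0)) / X.std(axis=0)


# Hypothetical data: both features depend on the same signal,
# but the second one has an SD roughly 1000 times larger.
rng = np.random.default_rng(0)
n = 500
signal = rng.normal(size=n)
small = signal + 0.5 * rng.normal(size=n)
large = 1000 * (signal + 0.5 * rng.normal(size=n))
X = np.column_stack([small, large])

pc_raw = first_pc(X)
pc_scaled = first_pc(standardize(X))

# Without scaling, nearly all the weight lands on the large-SD feature.
print("first PC, raw data:   ", np.abs(pc_raw))
# After scaling, both features contribute about equally.
print("first PC, scaled data:", np.abs(pc_scaled))
```

In practice you would probably not write this by hand; with scikit-learn, chaining `StandardScaler` and `PCA` in a `Pipeline` should give the same ordering (scale first, then reduce), and it also keeps the scaler fitted on the training data only.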