I'm currently using DETR for object detection, and I want to convert the model as follows: PyTorch -> ONNX -> TensorRT. I have the conversion code working and have tested the model, achieving the same accuracy in all three formats. The problem is that the model is in FP32, and when I convert the whole thing to FP16 I lose a lot of accuracy. My idea is to convert only some layers to FP16 and leave the rest in FP32 to preserve as much accuracy as possible.
My question is: how do I convert specific layers of the TensorRT model to FP16? I couldn't find any documentation on this. Any and all help is appreciated.
Infer using mixed precision in TensorRT
147 views · Asked by Faisal Hejary · 0 answers
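Since there are no answers yet, here is a minimal sketch of one approach. TensorRT's Python builder API exposes per-layer precision control: enable the FP16 builder flag globally, then pin individual layers back to FP32 via `layer.precision` and `layer.set_output_type`, and set `BuilderFlag.OBEY_PRECISION_CONSTRAINTS` so the builder honors those pins instead of treating them as hints. The file name `detr.onnx` and the layer-selection policy below are assumptions for illustration, and this targets the TensorRT 8.x API; it needs an NVIDIA GPU and a TensorRT install to actually run.

```python
# Sketch: build a mixed-precision TensorRT engine from an ONNX file,
# keeping numerically sensitive layers in FP32 (TensorRT 8.x Python API).
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

# "detr.onnx" is a hypothetical path to your exported model.
with open("detr.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("ONNX parse failed")

config = builder.create_builder_config()
# Allow FP16 kernels globally...
config.set_flag(trt.BuilderFlag.FP16)
# ...but force the builder to honor the per-layer precisions set below,
# rather than treating them as optional hints.
config.set_flag(trt.BuilderFlag.OBEY_PRECISION_CONSTRAINTS)

for i in range(network.num_layers):
    layer = network.get_layer(i)
    # Example policy (an assumption -- tune for your model): keep
    # softmax layers in FP32, let everything else run in FP16.
    if layer.type == trt.LayerType.SOFTMAX:
        layer.precision = trt.float32
        layer.set_output_type(0, trt.float32)
    else:
        layer.precision = trt.float16

engine_bytes = builder.build_serialized_network(network, config)
```

If you prefer not to write builder code, `trtexec` exposes the same mechanism on the command line via `--fp16` combined with `--layerPrecisions=<layerName>:fp32,...` and `--precisionConstraints=obey`. A practical workflow is to start with everything in FP16, compare per-layer outputs against the FP32 engine (e.g. with Polygraphy), and pin back to FP32 only the layers where the error is concentrated.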