Can anyone explain the pros and cons to using Data.Text and Data.ByteString.Char8 data types? Does working with ASCII-only text change these pros and cons? Do their lazy variants change the story as well?
Data.Text vs Data.ByteString.Char8
Asked by Thomas Eding
Data.ByteString.Char8 provides functions to treat ByteString values as sequences of 8-bit characters (in practice, ASCII), while Data.Text is an independent type supporting the entirety of Unicode. ByteString and Text are essentially the same as far as representation goes: strict, unboxed arrays, with lazy variants based on lists of strict chunks. The main difference is that ByteString stores octets (i.e. Word8s), while Text stores Chars, encoded in UTF-16 (UTF-8 as of text-2.0).

If you're working with ASCII-only text, then using Data.ByteString.Char8 will probably be faster than Text and use less memory (a gap that narrows with the UTF-8 representation in text-2.0 and later); however, you should ask yourself whether you're really sure that you're only ever going to work with ASCII. Basically, in 99% of cases, using Data.ByteString.Char8 over Text is a speed hack: octets aren't characters, and most Haskellers would agree that using the correct type should be prioritised over raw, bare-metal speed. You should usually only consider it if you've profiled the program and it's a bottleneck. Text is well-optimised, and the difference will probably be negligible in most cases.

Of course, there are non-speed-related situations in which Data.ByteString.Char8 is warranted. Consider a file containing data that is essentially binary, not text, but separated into lines; using the lines function is completely reasonable. Additionally, it's entirely conceivable that an integer might be encoded in ASCII decimal in the context of a binary format; using readInt would make perfect sense in that case.

So, basically:

- Data.ByteString.Char8: For pure ASCII situations where performance is paramount, and to handle "almost-binary" data that has some ASCII components.
- Data.Text: Text, including any situation where there's the slightest possibility of something other than ASCII being used.
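
To make the two cases above concrete, here is a minimal sketch, assuming the bytestring and text packages that ship with GHC. The first half shows why octets aren't characters: Data.ByteString.Char8.pack silently truncates a non-Latin-1 character, while Text keeps it and makes the byte encoding an explicit step. The second half shows the "almost-binary" use case with lines and readInt. The record layout (an ASCII decimal tag, a separator byte, then a payload) and the names lambda, viaChar8, viaText, parseRecord and parseRecords are made up for illustration; they are not part of either library.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- The Greek letter lambda, U+03BB, written as an escape so the example
-- does not depend on the source file's encoding.
lambda :: String
lambda = "\955"

-- Char8.pack keeps only the low 8 bits of each Char, so U+03BB is
-- silently truncated to the single byte 0xBB.
viaChar8 :: B.ByteString
viaChar8 = BC.pack lambda

-- Text keeps the code point; turning it into octets is an explicit,
-- named encoding step (UTF-8 here yields two bytes: 0xCE 0xBB).
viaText :: B.ByteString
viaText = TE.encodeUtf8 (T.pack lambda)

-- Hypothetical "almost-binary" record: an ASCII decimal tag, one
-- separator byte, then a byte payload (without newlines), one record
-- per line.
parseRecord :: B.ByteString -> Maybe (Int, B.ByteString)
parseRecord line = do
  (tag, rest) <- BC.readInt line   -- parse the leading ASCII integer
  pure (tag, B.drop 1 rest)        -- skip the separator byte

-- Split the input on newline bytes and parse each line independently.
parseRecords :: B.ByteString -> [Maybe (Int, B.ByteString)]
parseRecords = map parseRecord . BC.lines
```

In GHCi, B.unpack viaChar8 gives [187] while B.unpack viaText gives [206,187], and parseRecords (BC.pack "3 abc\n12 xyz\n") gives [Just (3,"abc"),Just (12,"xyz")]. The payload is never decoded as text, which is why ByteString is the appropriate type for it.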