I'm looking for a way around the borrow-checker limitation that Polonius is meant to fix, in this specific circumstance. The existing answers on this topic don't seem applicable, as far as I can tell at the moment.
I have two structures, SourceBytes<S> and SourceChars<S>. The former stands alone, but the latter is heavily coupled to it. Both should be constructible from any S: Iterator<Item = u8>.
This is what the definition looks like for each:
#[derive(Clone, Debug)]
pub struct SourceBytes<S>
where
    S: Iterator<Item = u8>,
{
    iter: S,
    buffer: Vec<S::Item>,
}

#[derive(Clone, Debug)]
pub struct SourceChars<S>(S)
where
    S: Iterator<Item = u8>;
The purpose of SourceBytes<S> is to abstract over S so that each S::Item can be buffered, and be read immutably without taking/popping the item from the iterator. That looks like this:
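impl<S> Iterator for SourceBytes<S>
where
    S: Iterator<Item = u8>,
{
    type Item = S::Item;

    fn next(&mut self) -> Option<Self::Item> {
        // Drain buffered bytes front-first so they come out in source order;
        // `Vec::pop` would return the most recently buffered byte instead.
        if self.buffer.is_empty() {
            self.iter.next()
        } else {
            Some(self.buffer.remove(0))
        }
    }
}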
This works fine, and the buffer is handled like so:
impl<S> SourceBytes<S>
where
    S: Iterator<Item = u8>,
{
    // pub fn new<I>(iter: I) -> Self
    // where
    //     I: IntoIterator<Item = S::Item, IntoIter = S>,
    // {
    //     Self {
    //         iter: iter.into_iter(),
    //         buffer: Vec::new(),
    //     }
    // }

    fn buffer(&mut self, count: usize) -> Option<&[u8]> {
        if self.buffer.len() < count {
            self.buffer
                .extend(self.iter.by_ref().take(count - self.buffer.len()));
        }
        self.buffer.get(0..count)
    }
}
Each time SourceBytes<S>::buffer is called, any missing items are taken from S and pushed onto buffer. Each time <SourceBytes<S> as Iterator>::next is called, it first takes from self.buffer, and then from self.iter, whose type is S.
Now, the purpose of SourceChars<S> is to provide an Iterator interface that reads bytes from self.0 (which is S) until they form a valid UTF-8 char, and then returns it:
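As a minimal sketch of how the buffering and the iterator interact, the following self-contained example peeks at two bytes and then consumes them. The `new` constructor is a hypothetical helper added only for this example, and `next` here drains the buffer from the front (`Vec::remove(0)`), which is an assumption of the sketch so that peeked bytes come back in the order they were read.

```rust
pub struct SourceBytes<S>
where
    S: Iterator<Item = u8>,
{
    iter: S,
    buffer: Vec<u8>,
}

impl<S> SourceBytes<S>
where
    S: Iterator<Item = u8>,
{
    // Hypothetical constructor, not part of the code in the question.
    pub fn new(iter: S) -> Self {
        Self { iter, buffer: Vec::new() }
    }

    // Make sure `count` bytes are buffered, then expose them without consuming.
    fn buffer(&mut self, count: usize) -> Option<&[u8]> {
        if self.buffer.len() < count {
            let missing = count - self.buffer.len();
            self.buffer.extend(self.iter.by_ref().take(missing));
        }
        self.buffer.get(0..count)
    }
}

impl<S> Iterator for SourceBytes<S>
where
    S: Iterator<Item = u8>,
{
    type Item = u8;

    fn next(&mut self) -> Option<u8> {
        // Assumption of this sketch: buffered bytes are yielded front-first.
        if self.buffer.is_empty() {
            self.iter.next()
        } else {
            Some(self.buffer.remove(0))
        }
    }
}

fn main() {
    let mut src = SourceBytes::new("abc".bytes());
    // Peek at two bytes without consuming them.
    assert_eq!(src.buffer(2), Some(&b"ab"[..]));
    // The peeked bytes are still yielded by the iterator afterwards.
    assert_eq!(src.next(), Some(b'a'));
    // Asking for more bytes than the source holds yields None.
    assert_eq!(src.buffer(5), None);
    // Everything that was buffered is still available to `next`.
    assert_eq!(src.collect::<Vec<u8>>(), vec![b'b', b'c']);
}
```

Note that a failed `buffer` call still moves the remaining bytes from the inner iterator into the buffer; they are not lost, only staged.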
impl<S> Iterator for SourceChars<S>
where
    S: Iterator<Item = u8>,
{
    type Item = char;

    fn next(&mut self) -> Option<Self::Item> {
        let mut buf = [0; 4];
        // A single character can be at most 4 bytes.
        for (i, byte) in self.0.by_ref().take(4).enumerate() {
            buf[i] = byte;
            if let Ok(slice) = std::str::from_utf8(&buf[..=i]) {
                return slice.chars().next();
            }
        }
        None
    }
}
This also works fine.
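For illustration, here is a small self-contained sketch of that decoding loop applied to characters of one to four bytes. The sample strings are made up for the example. It also shows one consequence of the loop as written: an invalid byte sequence makes `next` return None, which ends iteration.

```rust
pub struct SourceChars<S>(S)
where
    S: Iterator<Item = u8>;

impl<S> Iterator for SourceChars<S>
where
    S: Iterator<Item = u8>,
{
    type Item = char;

    fn next(&mut self) -> Option<char> {
        let mut buf = [0u8; 4];
        // Read one byte at a time until the prefix is valid UTF-8.
        for (i, byte) in self.0.by_ref().take(4).enumerate() {
            buf[i] = byte;
            if let Ok(slice) = std::str::from_utf8(&buf[..=i]) {
                return slice.chars().next();
            }
        }
        // Either the source is exhausted or no valid char was found.
        None
    }
}

fn main() {
    // One-, two-, three- and four-byte characters.
    let text = "a\u{e9}\u{20ac}\u{1f600}";
    let decoded: String = SourceChars(text.bytes()).collect();
    assert_eq!(decoded, text);

    // 0xFF can never start a valid UTF-8 sequence, so decoding stops here.
    let mut invalid = SourceChars(vec![0xFFu8, b'a'].into_iter());
    assert_eq!(invalid.next(), None);
}
```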
Now, I also wish to provide an impl for SourceChars<&mut SourceBytes<S>>, so that SourceChars can rely on the buffer provided by self.0 (which, in this circumstance, is &mut SourceBytes<S>).
impl<S> SourceChars<&mut SourceBytes<S>>
where
    S: Iterator<Item = u8>,
{
    fn buffer(&mut self, count: usize) -> Option<&str> {
        // let mut src = self.0.by_ref();
        for byte_count in 0.. {
            let Some(buf) = self.0.buffer(byte_count) else {
                return None;
            };
            if let Ok(slice) = std::str::from_utf8(buf) {
                if slice.chars().count() >= count {
                    return Some(slice);
                }
            }
        }
        unreachable!()
    }
}
SourceChars<&mut SourceBytes<S>>::buffer relies on SourceBytes<S>::buffer to actually buffer the bytes; SourceChars itself only acts as a wrapper that changes the interpretation of the iterator S from bytes to chars.
The problem is that self.0 cannot be borrowed mutably more than once at a time, and inside the loop the compiler still treats the mutable borrow of *self.0 from the previous iteration as live, because the slice returned from one branch has to outlive the function call.
How can I implement this in such a way that SourceChars relies on SourceBytes::buffer without running into this compiler error?
error[E0499]: cannot borrow `*self.0` as mutable more than once at a time
   --> src/parser/iter.rs:122:29
    |
119 |     fn buffer(&mut self, count: usize) -> Option<&str> {
    |               - let's call the lifetime of this reference `'1`
...
122 |             let Some(buf) = self.0.buffer(byte_count) else {
    |                             ^^^^^^ `*self.0` was mutably borrowed here in the previous iteration of the loop
...
127 |                     return Some(slice);
    |                            ----------- returning this value requires that `*self.0` is borrowed for `'1`
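One safe shape that sidesteps this error is to split the work into two phases: the loop only computes how many bytes are needed, and a single borrow after the loop produces the returned slice. The following is a sketch under stated assumptions, not a definitive fix: the `new` constructor is hypothetical, `next` drains the buffer front-first, and the final call decodes the buffered bytes a second time.

```rust
pub struct SourceBytes<S: Iterator<Item = u8>> {
    iter: S,
    buffer: Vec<u8>,
}

impl<S: Iterator<Item = u8>> SourceBytes<S> {
    // Hypothetical constructor, not part of the code in the question.
    pub fn new(iter: S) -> Self {
        Self { iter, buffer: Vec::new() }
    }

    fn buffer(&mut self, count: usize) -> Option<&[u8]> {
        if self.buffer.len() < count {
            let missing = count - self.buffer.len();
            self.buffer.extend(self.iter.by_ref().take(missing));
        }
        self.buffer.get(0..count)
    }
}

impl<S: Iterator<Item = u8>> Iterator for SourceBytes<S> {
    type Item = u8;

    fn next(&mut self) -> Option<u8> {
        // Assumption of this sketch: buffered bytes are yielded front-first.
        if self.buffer.is_empty() {
            self.iter.next()
        } else {
            Some(self.buffer.remove(0))
        }
    }
}

pub struct SourceChars<S: Iterator<Item = u8>>(S);

impl<S: Iterator<Item = u8>> SourceChars<&mut SourceBytes<S>> {
    fn buffer(&mut self, count: usize) -> Option<&str> {
        // Phase 1: find the byte count. Nothing borrowed from `self.0`
        // leaves an iteration, so each mutable borrow ends inside the loop.
        let mut byte_count = 0;
        let needed = loop {
            let buf = self.0.buffer(byte_count)?;
            if let Ok(slice) = std::str::from_utf8(buf) {
                if slice.chars().count() >= count {
                    break byte_count;
                }
            }
            byte_count += 1;
        };
        // Phase 2: one final borrow whose result is returned unconditionally.
        let buf = self.0.buffer(needed)?;
        std::str::from_utf8(buf).ok()
    }
}

fn main() {
    let mut bytes = SourceBytes::new("h\u{e9}llo".bytes());
    let mut chars = SourceChars(&mut bytes);
    // Two chars need three bytes, since U+00E9 is two bytes long.
    assert_eq!(chars.buffer(2), Some("h\u{e9}"));
    assert_eq!(chars.buffer(0), Some(""));
    // The source only holds five chars.
    assert_eq!(chars.buffer(10), None);
}
```

The cost of this design is the repeated UTF-8 validation in phase 2; it trades a little redundant work for code that the current borrow checker accepts without unsafe.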
One option that I previously tried was the crate polonius-the-crab, but that ended up causing more problems with the usage of the API, in addition to making trait bounds difficult to get right.
Because of this inconvenience, I ended up using an unsafe pointer coercion to reduce the lifetime of the buf to no longer be dependent upon the &mut SourceBytes.
Additionally, here are the tests that show usage of the API. Using the polonius-the-crab crate failed to solve some lifetime issues that I ran across while implementing these tests.