I've been trying to move from postgres to tokio_postgres, but I'm struggling with some of the async parts.
use scraper::Html;
use std::sync::Arc;
use tokio::sync::Mutex;
use tokio::task;

struct Url {}

impl Url {
    fn scrapped_home(&self, symbol: String) -> Html {
        let url = format!(
            "https://finance.yahoo.com/quote/{}?p={}&.tsrc=fin-srch", symbol, symbol
        );
        let response = reqwest::blocking::get(url).unwrap().text().unwrap();
        scraper::Html::parse_document(&response)
    }
}

#[derive(Clone)]
struct StockData {
    symbol: String,
}

#[tokio::main]
async fn main() {
    let stock_data = StockData { symbol: "".to_string() };
    let url = Url {};
    let mut uri_test: Arc<Mutex<Html>> = Arc::new(Mutex::from(url.scrapped_home(stock_data.clone().symbol)));
    let mut uri_test_closure = Arc::clone(&uri_test);
    let uri = task::spawn_blocking(|| {
        uri_test_closure.lock()
    });
}
Without putting a Mutex around
url.scrapped_home(stock_data.clone().symbol)
I would get the error that a runtime cannot be dropped in a context where blocking is not allowed, so I put it inside spawn_blocking. Then I get the error that Cell cannot be shared between threads safely. From what I could gather, this is because Cell isn't Sync. I then wrapped it in a Mutex, but that still throws 'Cell cannot be shared between threads safely'.
Now, is that because it contains a reference to a Cell and therefore isn't memory-safe? If so, would I need to implement Sync for Html? And how?
Html is from the scraper crate.
UPDATE:
Sorry, here's the error.
error: future cannot be sent between threads safely
   --> src/database/queries.rs:141:40
    |
141 |       let uri = task::spawn_blocking(|| {
    |  ________________________________________^
142 | |         uri_test_closure.lock()
143 | |     });
    | |_________^ future is not `Send`
    |
    = help: within `tendril::tendril::NonAtomic`, the trait `Sync` is not implemented for `Cell<usize>`
note: required by a bound in `spawn_blocking`
   --> /home/a/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.20.1/src/task/blocking.rs:195:12
    |
195 |         R: Send + 'static,
    |            ^^^^ required by this bound in `spawn_blocking`
UPDATE:
Adding Cargo.toml as requested:
[package]
name = "reprod"
version = "0.1.0"
edition = "2021"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
reqwest = { version = "0.11", features = ["json", "blocking"] }
tokio = { version = "1", features = ["full"] }
tokio-postgres = "0"
scraper = "0.12.0"
UPDATE: Added original sync code:
fn main() {
    let stock_data = StockData { symbol: "".to_string() };
    let url = Url {};
    url.scrapped_home(stock_data.clone().symbol);
}
UPDATE: Thanks to Kevin I was able to get it to work. As he pointed out, Html is neither Send nor Sync. This part of the Rust lang doc helped me understand how message passing works.
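use std::sync::mpsc;
use std::thread;

pub fn scrapped_home(&self, symbol: String) -> Html {
    let (tx, rx) = mpsc::channel();
    let url = format!(
        "https://finance.yahoo.com/quote/{}?p={}&.tsrc=fin-srch", symbol, symbol
    );
    // Do the blocking request on its own thread and send back only the text.
    thread::spawn(move || {
        let val = reqwest::blocking::get(url).unwrap().text().unwrap();
        tx.send(val).unwrap();
    });
    // Html is built on the receiving side, so it never crosses a thread boundary.
    scraper::Html::parse_document(&rx.recv().unwrap())
}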
Afterwards I had some sort of epiphany and got it to work with tokio as well, without message passing:
pub async fn scrapped_home(&self, symbol: String) -> Html {
    let url = format!(
        "https://finance.yahoo.com/quote/{}?p={}&.tsrc=fin-srch", symbol, symbol
    );
    let response = task::spawn_blocking(move || {
        reqwest::blocking::get(url).unwrap().text().unwrap()
    }).await.unwrap();
    scraper::Html::parse_document(&response)
}
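For anyone who wants to try the pattern without tokio, reqwest, or scraper, here is a minimal sketch using only the standard library. The names are hypothetical stand-ins, not anything from those crates: fetch_text plays the role of the blocking HTTP request, and Document plays the role of Html (it holds an Rc, which makes it !Send). As far as I understand it, the point is that only the String crosses the thread boundary.

```rust
use std::rc::Rc;
use std::thread;

// Hypothetical stand-in for the blocking HTTP request; returns plain text.
fn fetch_text(symbol: &str) -> String {
    format!("<html><title>{}</title></html>", symbol)
}

// Hypothetical stand-in for scraper::Html: holding an Rc makes it !Send.
struct Document {
    source: Rc<String>,
}

fn load_document(symbol: String) -> Document {
    // Only the String (which is Send) crosses the thread boundary.
    let text = thread::spawn(move || fetch_text(&symbol))
        .join()
        .unwrap();
    // The !Send value is built on the calling thread and never leaves it.
    Document { source: Rc::new(text) }
}

fn main() {
    let doc = load_document("AAPL".to_string());
    println!("{}", doc.source);
}
```

Trying to build the Document inside the spawned closure and return it from there should fail to compile, for the same reason the original code did.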
I hope that this might help someone.
This illustrates it a bit more clearly now: you're trying to return a tokio::sync::MutexGuard across a thread boundary.

When you call task::spawn_blocking with that closure, the uri_test_closure.lock() call (tokio::sync::Mutex::lock()) doesn't have a semicolon, which means the closure returns the object that's the result of the call. But you can't return a MutexGuard across a thread boundary.

I suggest you read up on the linked lock() call, as well as blocking_lock() and such there.

I'm not certain of the point of your call to task::spawn_blocking here. If you're trying to illustrate a use case for something, that's not coming across.

Edit:

The problem is deeper. Html is both !Send and !Sync, which means you can't even wrap it up in an Arc<Mutex<Html>> or Arc<Mutex<Option<Html>>> or whatever. You need to get the data from another thread in another way, and not as that "whole" object. See this post on the rust user forum for more detailed information. But whatever you're wrapping must be Send, and that struct is explicitly not.

So if a type is Send and !Sync, you can wrap it in a Mutex and an Arc. But if it's !Send, you're hooped, and need to use message passing or other synchronization mechanisms.
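To make the Send/Sync distinction concrete, here is a minimal sketch using only the standard library. It assumes Cell<usize> as a stand-in for a type that is Send but !Sync (the same type named in the compiler error), and it uses std::sync::Mutex rather than tokio's. The helper names assert_send, assert_sync, and bump_from_threads are made up for illustration.

```rust
use std::cell::Cell;
use std::sync::{Arc, Mutex};
use std::thread;

// Compile-time checks: each call only compiles if T satisfies the bound.
fn assert_send<T: Send>() {}
fn assert_sync<T: Sync>() {}

// Increments a shared Cell<usize> once from each of `n` threads.
fn bump_from_threads(n: usize) -> usize {
    // Cell<usize> is Send but !Sync. Mutex<T> is Sync whenever T is Send,
    // so Arc<Mutex<Cell<usize>>> can be shared across threads.
    let counter = Arc::new(Mutex::new(Cell::new(0usize)));

    let handles: Vec<_> = (0..n)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                let guard = counter.lock().unwrap();
                guard.set(guard.get() + 1);
                // The guard is dropped here, on the thread that acquired it.
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let total = counter.lock().unwrap().get();
    total
}

fn main() {
    assert_send::<Cell<usize>>();
    assert_sync::<Mutex<Cell<usize>>>();
    // assert_sync::<Cell<usize>>();        // would not compile: Cell is !Sync
    // assert_send::<std::rc::Rc<usize>>(); // would not compile: Rc is !Send
    println!("{}", bump_from_threads(4));
}
```

Swapping Cell<usize> for an Rc-based type should make the thread::spawn call fail to compile, which is the situation Html is in.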