This question comes mainly from curiosity, and I’m not quite sure how best to phrase it, especially in a title. Say one thread writes to a variable of an essentially primitive type while another thread reads it at the same time. Is there any chance the read happens while the variable is only half written, producing either weird values or undefined behavior?

Take an 8-bit value changing from 00010101 to 11101000.

I’m imagining that if only 4 bits have been written when we try to read it, the result could be something like

11100101

To play around, I made this small Rust sample. It ran without producing garbage: first it printed a bunch of lines stating “String = Hello!” and then “String = Hi!”, with no weirdness or issues. I half-expected something like “String = #æé¼¨A” or a segfault.

use std::thread::{self, JoinHandle, sleep};
use std::time; // needed for time::Duration below

const HELLO: &str = "Hello!";
const HI: &str = "Hi!";

struct ExemptSyncStringSlice<'a>(&'a str);

unsafe impl Sync for ExemptSyncStringSlice<'_> {}

fn print_ptr(pointer: *const ExemptSyncStringSlice)
{
	for _ in 1..500
	{
		unsafe
		{
			println!("String = {}", (*pointer).0);
		}
	}
}

fn main()
{
	
	static mut DESYNC_POINTER: ExemptSyncStringSlice = ExemptSyncStringSlice(HELLO);

	let join_handle: JoinHandle<()> = thread::spawn
	(
		|| {
			print_ptr(&raw const DESYNC_POINTER);
		}
	);
	sleep(time::Duration::from_millis(1));
	unsafe { DESYNC_POINTER.0 = HI; }
	
	join_handle.join().unwrap();
}
  • Killercat103@slrpnk.net (OP)
    8 days ago

    Hm. Good to know. As far as I know, Rust’s unsafe is kind of like sudo: more tools are granted to you with fewer checks and balances, those checks being the borrow checker. So I think unsafe is more like saying “I don’t want the extra protection against bad memory use” (like, say, using memory you already freed). Now, I’m not familiar with what makes a variable atomic or not. But if a variable can be read or written with a single instruction, I’m interpreting that as making it safe?

    • e0qdk@reddthat.com
      8 days ago

      Whether something’s atomic or not depends on the language you use, and if the language is vague about it (like old C), then also on how the CPU works.

      At the CPU instruction level, there are other factors like how an instruction interacts with memory. Go look up CMPXCHG (from x86) for example, if you want to go down the rabbit hole. There’s a StackOverflow answer here that you might find interesting about using that instruction in combination with the LOCK prefix.
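      In Rust, compare-and-swap is exposed as `compare_exchange` on the `std::sync::atomic` types. On x86-64 it will typically compile down to a `LOCK CMPXCHG`, though the exact instruction is up to the compiler. Here is a minimal sketch of a CAS retry loop; the names `cas_increment` and `count_with_cas` are made up for illustration, and the memory orderings are one reasonable choice rather than the only one:

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

/// Hypothetical helper: bump `counter` by one using a compare-and-swap retry loop.
/// (`fetch_add` does the same job in one call; the loop is spelled out to show the CAS.)
fn cas_increment(counter: &AtomicU32) {
    let mut seen = counter.load(Ordering::Relaxed);
    loop {
        // Store `seen + 1` only if the value is still `seen`;
        // otherwise another thread got there first, so retry with the fresh value.
        match counter.compare_exchange(seen, seen + 1, Ordering::SeqCst, Ordering::Relaxed) {
            Ok(_) => return,
            Err(actual) => seen = actual,
        }
    }
}

/// Spawn `threads` threads that each increment one shared counter `iters` times,
/// then return the final count.
fn count_with_cas(threads: u32, iters: u32) -> u32 {
    let counter = AtomicU32::new(0);
    // Scoped threads may borrow `counter` because they are joined before `scope` returns.
    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                for _ in 0..iters {
                    cas_increment(&counter);
                }
            });
        }
    });
    counter.load(Ordering::SeqCst)
}
```

      Calling `count_with_cas(4, 1000)` from `main` returns 4000 every time, because no increment can be lost between the read and the write.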

      At the language level, there are usually either guarantees (or a lack of guarantees…) about what is safe to do. C++11 (and later) has std::atomic for defining variables that are accessible from multiple threads safely without manually using a mutex, for example. Otherwise, you generally cannot assume that writing to a variable will be safe between threads (even if you think the operation should compile to a single CPU instruction) unless you use a supported concurrency mechanism that the compiler knows how to deal with. Consider the case where the compiler chooses to keep a value in a register during a loop as an optimization and only writes it back to RAM at the end of the loop – while that value is changed in RAM by another thread! If you use an atomic variable or protect access with a mutex, then the program will behave coherently. If not, you can end up with inconsistent state between threads and then who the fuck knows what will happen. This SO answer might also be interesting to you.
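      Applied to the experiment from the post, a mutex-protected version might look roughly like the sketch below. Note that a `&str` is a pointer plus a length, so it is wider than one machine word and there is no standard atomic type for it; wrapping it in a `Mutex` is one option. The function name `observe_swap` is hypothetical, and returning the observed values instead of printing them is just to make the result easy to check:

```rust
use std::sync::Mutex;
use std::thread;
use std::time::Duration;

const HELLO: &str = "Hello!";
const HI: &str = "Hi!";

/// A reader thread samples the shared string `reads` times while the
/// calling thread swaps it once. Returns every value the reader observed.
fn observe_swap(reads: usize) -> Vec<&'static str> {
    let shared: Mutex<&'static str> = Mutex::new(HELLO);
    thread::scope(|s| {
        let reader = s.spawn(|| {
            let mut seen = Vec::with_capacity(reads);
            for _ in 0..reads {
                // Holding the lock means we copy out the whole (pointer, length)
                // pair, never a mix of the old and new halves.
                seen.push(*shared.lock().unwrap());
            }
            seen
        });
        // Same timing as the original sample: wait briefly, then swap the string.
        thread::sleep(Duration::from_millis(1));
        *shared.lock().unwrap() = HI;
        reader.join().unwrap()
    })
}
```

      No `unsafe` is needed here, and every observed value is either `HELLO` or `HI`. How many of each you get depends on thread scheduling.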

      In Python (specifically the CPython implementation), there’s the Global Interpreter Lock (GIL). Because of the GIL, some things are safe to do in that language implementation that aren’t safe to do in C. (You still generally shouldn’t depend on them, though, since people are trying to remove the GIL to get better performance out of Python…) Basically, CPython guarantees that only one thread can run Python bytecode at a time, so interactions are serialized at that level even if the OS swaps threads while CPython is in the middle of computing the behavior of an instruction.

      Hope that helps a bit.