sqlite 到底是不是线程安全

背景

几个项目都用sqlite3,可能出现多个线程访问,项目自己加了锁,但我不知道到底sqlite到底是不是线程安全,于是网上搜索很多,都是翻译官方文档,具体也没有说清楚,于是今天自己搜索一下内容,不过搜索国外也没有很好解析,自己结合官网的文章和stackoverflow简单总结了一下。
下面以C++ SQLITE 来说明介绍

sqlite 线程安全设置

通过宏定义SQLITE_THREADSAFE是否线程安全,官方文档window与linux默认编译动态库是SQLITE_THREADSAFE = 1,代表线程安全,基本所有文档都这么说,但线程安全又分为3种模式,看这里就懵逼了,线程是否安全不就是加锁与不加锁,怎么还出现3种模式,我开始也不明白,后面看到官方介绍,自己细细分析区别就是加锁的程度不一样,因为SQL有很多函数,只是加锁函数个数不一样,比喻100个函数,如果模式0, 单线程那么就不用加锁,你都只有一个函数使用,这种场景其实也有,我就在主线程处理数据库操作,那么是不是就只有一个线程,那么有什么线程安全呢?所以100个函数都不进行加锁,感觉这个直接设置SQLITE_THREADSAFE = 0 不就可以了吗?但SQLITE 又提出这个概念,那么我在想可能一些底层访问还是加锁,只是我们用的API没有加锁而已(这里只是我猜测,没有看代码分析)

模式1 多线程模式,部分函数加锁,但一些函数不加锁,这种要一个线程一个连接,不能多个线程拿多个连接,可能出现线程竞争,这块具体哪些API,我目前还没有明白。

模式2 串行化模式,全部加锁,所以官方文档说这种没有限制的。默认采用这种模式

官方结论

https://www.sqlite.org/faq.html#q6

Threads are evil. Avoid them.

SQLite is threadsafe. We make this concession since many users choose to ignore the advice given in the previous paragraph. But in order to be thread-safe, SQLite must be compiled with the SQLITE_THREADSAFE preprocessor macro set to 1. Both the Windows and Linux precompiled binaries in the distribution are compiled this way. If you are unsure if the SQLite library you are linking against is compiled to be threadsafe you can call the sqlite3_threadsafe() interface to find out.

SQLite is threadsafe because it uses mutexes to serialize access to common data structures. However, the work of acquiring and releasing these mutexes will slow SQLite down slightly. Hence, if you do not need SQLite to be threadsafe, you should disable the mutexes for maximum performance. See the threading mode documentation for additional information.

Under Unix, you should not carry an open SQLite database across a fork() system call into the child process.

结论是线程安全,但必须要设置SQLITE_THREADSAFE=1,windows 与 linux 编译二进制文件默认设置thread-safe,
所以是线程安全,如果你考虑性能可以不要设置线程安全,因为这样子不用加锁,速度会快一些。但客户端又没有那么大高平凡需求,或者你只在一个线程使用,那么可以考虑使用非线程安全。

但这里设置线程安全,也要考虑线程模式,差别比较大,根据我上面思路,可以验证。

线程模式

https://www.sqlite.org/threadsafe.html

SQLite supports three different threading modes:

Single-thread. In this mode, all mutexes are disabled and SQLite is unsafe to use in more than a single thread at once.

Multi-thread. In this mode, SQLite can be safely used by multiple threads provided that no single database connection is used simultaneously in two or more threads.

Serialized. In serialized mode, SQLite can be safely used by multiple threads with no restriction.

The threading mode can be selected at compile-time (when the SQLite library is being compiled from source code) or at start-time (when the application that intends to use SQLite is initializing) or at run-time (when a new SQLite database connection is being created). Generally speaking, run-time overrides start-time and start-time overrides compile-time. Except, single-thread mode cannot be overridden once selected.

The default mode is serialized.

单线程模式

超过一个线程就是不安全,锁无效。

多线程模式

一个连接不能在多线程使用,因为一些函数不是安全,只是加锁颗粒变小,所以注意。具体官方文档有说明,看下面我会说

串行话模式

可以随便使用

线程模式官方更详细介绍

https://www.sqlite.org/c3ref/c_config_covering_index_scan.html#sqliteconfigsinglethread

SQLITE_CONFIG_SINGLETHREAD
There are no arguments to this option. This option sets the threading mode to Single-thread. In other words, it disables all mutexing and puts SQLite into a mode where it can only be used by a single thread. If SQLite is compiled with the SQLITE_THREADSAFE=0 compile-time option then it is not possible to change the threading mode from its default value of Single-thread and so sqlite3_config() will return SQLITE_ERROR if called with the SQLITE_CONFIG_SINGLETHREAD configuration option.

单线程模式:所有锁无效,如果你编译时候 SQLITE_THREADSAFE=0,就是默认这个模式,运行时候你无法动态改变线程模式,如果你设置就会报错。但如果线程安全那么是不是可以设置某个链接为了非线程安全(感觉可以)

SQLITE_CONFIG_MULTITHREAD
There are no arguments to this option. This option sets the threading mode to Multi-thread. In other words, it disables mutexing on database connection and prepared statement objects. The application is responsible for serializing access to database connections and prepared statements. But other mutexes are enabled so that SQLite will be safe to use in a multi-threaded environment as long as no two threads attempt to use the same database connection at the same time. If SQLite is compiled with the SQLITE_THREADSAFE=0 compile-time option then it is not possible to set the Multi-thread threading mode and sqlite3_config() will return SQLITE_ERROR if called with the SQLITE_CONFIG_MULTITHREAD configuration option.

设置多线程模式,数据库链接对象与预编译对象锁是关闭的,自己要保存链接与预编译是安全的。不能同一个时刻使用同一个链接,具体怎么不能使用,我暂时不能理解。。跟下面贴链接,大概就是同一个同时时刻执行不能执行冲突的语句。不然会竞争。
这里就有2种解决方案

  1. 队列【每个线程一个链接】,以任务形式丢到任务处理函数,这样子并发的。
  2. 线程池,每次拿到一个链接,执行完,放回连接池,这样子就不用管了,一定能保存,一个线程同一时刻使用到,主流还是连接池模式。
    这种多线程模式,性能比非安全低,因为部分有锁,但比串行又性能又好,毕竟并发的。如果对并发要求高,那么只能这种模式【客户端有这么多数据要处理吗?????我目前还是怀疑态度】

https://dev.yorhel.nl/doc/sqlaccess
貌似这篇文档有说一些解决方案

There are no arguments to this option. This option sets the threading mode to Serialized. In other words, this option enables all mutexes including the recursive mutexes on database connection and prepared statement objects. In this mode (which is the default when SQLite is compiled with SQLITE_THREADSAFE=1) the SQLite library will itself serialize access to database connections and prepared statements so that the application is free to use the same database connection or the same prepared statement in different threads at the same time. If SQLite is compiled with the SQLITE_THREADSAFE=0 compile-time option then it is not possible to set the Serialized threading mode and sqlite3_config() will return SQLITE_ERROR if called with the SQLITE_CONFIG_SERIALIZED configuration option.

可以放心在多个线程使用,给你锁的好好的,锁的颗粒最大。

查看线程模式

调用 sqlite3_threadsafe

我的总结

  1. 如果你设置SQLITE_THREADSAFE=1,并设置串行模式,那么就是线程安全的,默认windows与Linux官方的提供的包都是线程安全的。
  2. SQLITE_THREADSAFE = 1并设置多线程模式,必须保持同一时刻不能多线程处理同一个链接,所以解决方案是连接池,或者任务队列模式,来保存同一个时刻一个连接只由一个线程访问(我的理解,其实这个就不是线程安全,都要自己保存安全了,那还谈什么线程安全,我们常见线程安全就是直接调用,不要考虑不同线程),所以这种模式只是性能好一点,但线程安全还是由于把握。
  3. 串行模式,真正意义以上线程安全。你随便乱用都可以。
    模式只是锁的颗粒大小而已,但真正意义线程安全必须串行模式。大部分人就用串行就可以了,对并发要求高的就用多线程模式。如果只有一个线程使用的话,那么直接单线程模式(为什么要单独提出来就是因为默认线程安全模式,是有锁,所以必须你自己设置或者编译设置等等)

我这篇文章应该足够说明SQLITE 多线程问题了吧,感觉有的时候根本就不需要看代码,看代码反而更加搞不懂,因为别人的写的代码,不一定能映射到他的思维。直接看帮助文档和网上一些总结和自己开发经验就能推出来一些合理结论。。。。。

其他语言逻辑应该是一样的。。。。

node js 多线程

一、前提

我以前只认为node js 只能单线程,我自己用了一段时间electron 桌面开发,如果遇到CPU密集型业务怎么做呢?我不可能只靠一个CPU来搞业务啊,这个限制不是太大了?

二、解决方法

1:多进程

child_process

等模块创建多进程

这种打补丁方法不是很好,涉及进程通信,写代码效率不高,很容易出现bug。开发者很可能出现变量无法访问的问题,因为多个进程,变量没有共享。不过业务写很分开,也问题不大,多进程资源比较大。

2:多线程

worker_threads

真正用的是多线程,我自己写代码测试,不停创建worker_threads..

测试代码

const { Worker, isMainThread, parentPort } = require('worker_threads');
console.log("hello world:" + isMainThread)

/*if (isMainThread) {
  // This code is executed in the main thread and not in the worker.
  
  // Create the worker.
  const worker = new Worker(__filename);
  // Listen for messages from the worker and print them.
  worker.on('message', (msg) => { 
      console.log(msg); 
    });
} else {
  // This code is executed in the worker and not in the main thread.
  
  // Send a message to the main thread.
  parentPort.postMessage('Hello world!');
}*/

function test(){
    if(isMainThread){
      console.log("自动线程哦");
      const worker = new Worker("./work.js");
    }
   
}

setInterval(() => {
  test();
}, 1000);

通过定时器不停创建工程线程。

work.js

const { isMainThread, parentPort,Worker, threadId } = require('worker_threads');



const min = 2
const max = 1e7

function generatePrimes(start, range) {
    let primes = []
    let isPrime = true
    let end = start + range
    for (let i = start; i < end; i++) {
      for (let j = min; j < Math.sqrt(end); j++) {
        if (i !== j && i%j === 0) {
          isPrime = false
          break
        }
      }
      if (isPrime) {
        primes.push(i)
      }
      isPrime = true
    }
    return primes
 }


console.log("线程id:" + threadId)
generatePrimes(1, 10000000)
console.log("线程id:" + threadId + " 完成")

可以通过资源管理器查看线程【默认不显示,自己设置一下】,线程数不停增加

可以不停增加,因为我这个写法,会很快把cpu吃死。。。。如果像以前nodejs 普通写法,最多只能吃一个CPU..


注意在线程里面不要直接访问对象,毕竟node js 模式就是异步+主线程,引入多线程导致更多问题,多线程仅仅执行,然后丢回主线程即可。