Lingyun's Blog

Actions speak louder than words

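Bug-Hunting Log: A Child Process's Redirected Handle Never Closes

Category: debug | Published: 2022-04-06 11:29:00

Problem

A colleague recently told me about a strange problem he ran into with the following code on Windows: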

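#include <iostream>
#include <string>
#include <system_error>
#include <boost/process.hpp>
#include <boost/process/windows.hpp> // for bp::windows::hide

using namespace std;
namespace bp = boost::process;

void spawn_child(string cmd)
{
	//...
	bp::ipstream is; // read end of the pipe connected to the child's stdout
	std::error_code ec;
	bp::child ch(cmd, bp::windows::hide, ec, bp::std_out > is);
	string line;
	// getline blocks until a line arrives or every write handle to the pipe is closed
	while (ch.running() && getline(is, line)) {
		cout << line << endl;
	}

	ch.wait();
	cout << "child exited, code=" << ch.exit_code() << endl;
}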

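When several threads call spawn_child at the same time to create child processes, spawn_child sometimes fails to return even though its child process has already exited.

For example, suppose two threads t1 and t2 create child processes c1 and c2 respectively. It can happen that c1 has already exited but t1 stays blocked in getline, and t1's spawn_child only returns after c2 has exited as well.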

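Cause

c1 and c2 are unrelated processes. The reason t1's spawn_child has to wait for c2 is that c2 accidentally inherited the write end of the pipe that serves as c1's standard output. A read on a pipe reports end-of-file only after every handle to its write end has been closed, so even after c1 has exited, t1 keeps blocking until c2 exits too and releases its inherited copy.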
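The rule behind this behavior can be illustrated with a toy model. The sketch below is not Windows API code: ToyPipe and all of its members are hypothetical names invented for this illustration, and the class only tracks how many handles to the write end are still open.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Toy model (hypothetical, not a Windows API): a pipe reports end-of-file to
// its reader only after every handle to the write end has been closed.
class ToyPipe {
public:
	enum class ReadResult { Data, WouldBlock, Eof };

	// Models one more handle to the write end, for example the parent's
	// original wh or a copy inherited by a child process.
	void add_writer() { ++writers_; }

	// Models CloseHandle on a write handle, or the owning process exiting.
	void close_writer()
	{
		assert(writers_ > 0);
		--writers_;
	}

	void write(const std::string& s) { buffer_ += s; }

	// Models a read on the read end: buffered data comes first, then either
	// "would block" (a writer still exists) or EOF (no writer is left).
	ReadResult read(std::string& out)
	{
		if (!buffer_.empty()) {
			out = std::move(buffer_);
			buffer_.clear();
			return ReadResult::Data;
		}
		return writers_ > 0 ? ReadResult::WouldBlock : ReadResult::Eof;
	}

private:
	std::string buffer_;
	std::size_t writers_ = 0;
};
```

To replay the scenario, call add_writer three times (the parent, c1, and c2 each hold a write handle), then close_writer for the parent and for c1. After the buffered data is drained, read returns WouldBlock, and it returns Eof only once c2's handle has been closed as well.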

To reproduce the problem, I wrote a simple test program, shown below.

Child process code:

#include <iostream>
#include <string>
#include <chrono>
#include <thread>
#include <cstdlib> // for exit

using namespace std;

void usage(const char *program)
{
	cerr << "Usage: " << program << " <secs>" << endl;
	exit(1);
}

int main(int argc, char *argv[])
{
	if (argc < 2) usage(argv[0]);

	auto timeout = stoll(argv[1]);
	for (int i = 0; i < timeout; ++i) {
		cout << "counter: " << i + 1 << endl;
		this_thread::sleep_for(chrono::seconds(1));
	}

	cout << "counter exited" << endl;
	return 0;
}

The child process is very simple: it prints one log line per second and exits after <secs> seconds.

Parent process code:

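#include <iostream>
#include <string>
#include <atomic>
#include <chrono>
#include <thread>
#include <functional>   // std::function
#include <memory>       // std::unique_ptr
#include <system_error> // std::error_code, std::system_category
#include <cstring>      // memset
#include <Windows.h>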

using namespace std;

atomic_bool g_child2_spawnd = false;

void error_exit(string msg)
{
	error_code ec(GetLastError(), system_category());
	cerr << msg << ", err=" << ec.value() << ", " << ec.message() << endl;
	exit(1);
}

void spawn_child(LPSTR cmd, string mark,
	std::function<void()> before_cb = nullptr,
	std::function<void()> after_cb = nullptr)
{
	SECURITY_ATTRIBUTES sa;
	sa.nLength = sizeof(sa);
	sa.lpSecurityDescriptor = NULL;
	sa.bInheritHandle = TRUE;

	HANDLE rh = INVALID_HANDLE_VALUE, wh = INVALID_HANDLE_VALUE;
	if (!CreatePipe(&rh, &wh, &sa, 0)) {
		error_exit("Failed to CreatePipe");
	}


	if (!SetHandleInformation(rh, HANDLE_FLAG_INHERIT, 0)) {
		error_exit("Failed to SetHandleInformation");
	}

	if (before_cb) before_cb();

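	STARTUPINFOEXA si;
	memset(&si, 0, sizeof(si));
	// EXTENDED_STARTUPINFO_PRESENT requires cb to be the size of the extended structure
	si.StartupInfo.cb = sizeof(si);
	si.StartupInfo.hStdOutput = wh;
	si.StartupInfo.dwFlags = STARTF_USESHOWWINDOW | STARTF_USESTDHANDLES;
	si.StartupInfo.wShowWindow = SW_HIDE;
	si.lpAttributeList = NULL; // no handle list: the child inherits every inheritable handle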

	PROCESS_INFORMATION pi;
	memset(&pi, 0, sizeof(pi));
	auto rc = CreateProcessA(nullptr, cmd, NULL, NULL, TRUE, EXTENDED_STARTUPINFO_PRESENT,
		NULL, NULL, &si.StartupInfo, &pi);
	if (!rc) error_exit("Failed to CreateProcessA");

	// close the parent's copy of the pipe's write end so that ReadFile can reach EOF
	CloseHandle(wh);
	wh = INVALID_HANDLE_VALUE;

	if (after_cb) after_cb();

	char buff[1024];
	DWORD bytes_read;
	while (true) {
		if (!ReadFile(rh, buff, 1024, &bytes_read, NULL)) {
			if (GetLastError() == ERROR_BROKEN_PIPE) break;
			error_exit("Failed to read " + mark);
		}

		if (!bytes_read) break;
		cout << mark << ": " << string(buff, bytes_read) << endl;
	}

	WaitForSingleObject(pi.hProcess, INFINITE);
	DWORD errorcode = 0;
	if (!GetExitCodeProcess(pi.hProcess, &errorcode)) error_exit("Failed to GetExitCodeProcess");
	CloseHandle(pi.hThread);
	CloseHandle(pi.hProcess);
	CloseHandle(rh);
	cout << mark << " exited, code=" << errorcode << endl;
}

void spawn_child1()
{
	// LPSTR is a non-const char*, so pass a writable buffer rather than a string literal
	char cmd[] = "childproc 30";
	spawn_child(cmd, "c1", []() {
		// wait until child2 has been spawned
		while (!g_child2_spawnd) {
			this_thread::sleep_for(chrono::milliseconds(50));
		}
	});
}

void spawn_child2()
{
	char cmd[] = "childproc 60";
	spawn_child(cmd, "c2", nullptr, []() {
		g_child2_spawnd = true;
	});
}

int main(int argc, char *argv[])
{
	unique_ptr<thread> th1{ new thread(spawn_child1) };
	unique_ptr<thread> th2{ new thread(spawn_child2) };
	th1->join();
	th2->join();
	cout << "main exited" << endl;
	return 0;
}

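To make the problem reproduce every time, t1 deliberately waits until t2's child process has been created before it calls CreateProcess.

Solution

Once the cause is known, the fix is straightforward. The most direct approach is to add a lock to spawn_child that is acquired before CreatePipe and released only after CreateProcess has returned and the wh handle has been closed. The code in before_cb and after_cb in the test program above must of course be removed, otherwise t1 could end up waiting for c2 while holding the lock that t2 needs.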
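Here is a minimal sketch of the locking approach, assuming a single process-wide mutex. spawn_mutex and spawn_serialized are hypothetical names, and the three callables stand in for the CreatePipe, CreateProcess and CloseHandle(wh) steps of spawn_child; the sketch itself calls no Windows API.

```cpp
#include <mutex>

// Hypothetical process-wide mutex shared by every thread that spawns children.
inline std::mutex& spawn_mutex()
{
	static std::mutex m;
	return m;
}

// Runs the three steps as one critical section, so no other thread going
// through this function can create a process while this thread's inheritable
// write handle is still open.
template <class CreatePipeFn, class CreateProcessFn, class CloseWriteEndFn>
bool spawn_serialized(CreatePipeFn create_pipe,
	CreateProcessFn create_process,
	CloseWriteEndFn close_write_end)
{
	std::lock_guard<std::mutex> lock(spawn_mutex());
	if (!create_pipe()) return false;      // nothing to close yet
	const bool created = create_process();
	close_write_end();                     // close wh even if creation failed
	return created;
	// the lock is released on return; reading from rh happens outside the lock
}
```

The read loop is intentionally left outside the critical section: holding the lock while reading would serialize the children's entire lifetimes rather than just their creation.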

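If you only need to support Windows Vista and later, you can also solve the problem by using the lpAttributeList field of STARTUPINFOEXA to specify exactly which handles the child process inherits, as follows: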

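#include <iostream>
#include <string>
#include <atomic>
#include <chrono>
#include <thread>
#include <functional>   // std::function
#include <memory>       // std::unique_ptr
#include <system_error> // std::error_code, std::system_category
#include <cstring>      // memset
#include <Windows.h>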

using namespace std;

atomic_bool g_child2_spawned = false;

void error_exit(string msg)
{
	error_code ec(GetLastError(), system_category());
	cerr << msg << ", err=" << ec.value() << ", " << ec.message() << endl;
	exit(1);
}

void spawn_child(LPSTR cmd, string mark,
	std::function<void()> before_cb = nullptr,
	std::function<void()> after_cb = nullptr)
{
	SECURITY_ATTRIBUTES sa;
	sa.nLength = sizeof(sa);
	sa.lpSecurityDescriptor = NULL;
	sa.bInheritHandle = TRUE;

	HANDLE rh = INVALID_HANDLE_VALUE, wh = INVALID_HANDLE_VALUE;
	if (!CreatePipe(&rh, &wh, &sa, 0)) {
		error_exit("Failed to CreatePipe");
	}


	if (!SetHandleInformation(rh, HANDLE_FLAG_INHERIT, 0)) {
		error_exit("Failed to SetHandleInformation");
	}

	if (before_cb) before_cb();

	STARTUPINFOEXA si;
	memset(&si, 0, sizeof(si));
	// EXTENDED_STARTUPINFO_PRESENT requires cb to be the size of the extended structure
	si.StartupInfo.cb = sizeof(si);
	si.StartupInfo.hStdOutput = wh;
	si.StartupInfo.dwFlags = STARTF_USESHOWWINDOW | STARTF_USESTDHANDLES;
	si.StartupInfo.wShowWindow = SW_HIDE;

	// the first call only queries the buffer size needed for one attribute
	SIZE_T size = 0;
	BOOL rc = InitializeProcThreadAttributeList(NULL, 1, 0, &size) ||
		GetLastError() == ERROR_INSUFFICIENT_BUFFER;
	if (!rc) error_exit("Failed to InitializeProcThreadAttributeList");
	unique_ptr<char[]> attr_buff(new char[size]);
	LPPROC_THREAD_ATTRIBUTE_LIST attribute_list = reinterpret_cast<LPPROC_THREAD_ATTRIBUTE_LIST>(attr_buff.get());
	if (!InitializeProcThreadAttributeList(attribute_list, 1, 0, &size)) error_exit("Failed to InitializeProcThreadAttributeList");
	// restrict inheritance to wh: other inheritable handles are not passed to the child
	if (!UpdateProcThreadAttribute(attribute_list, 0, PROC_THREAD_ATTRIBUTE_HANDLE_LIST, &wh, sizeof(HANDLE), NULL, NULL))
		error_exit("Failed to UpdateProcThreadAttribute");
	si.lpAttributeList = attribute_list;

	PROCESS_INFORMATION pi;
	memset(&pi, 0, sizeof(pi));
	rc = CreateProcessA(nullptr, cmd, NULL, NULL, TRUE, EXTENDED_STARTUPINFO_PRESENT,
		NULL, NULL, &si.StartupInfo, &pi);
	if (!rc) error_exit("Failed to CreateProcessA");

	// close the parent's copy of the pipe's write end so that ReadFile can reach EOF
	CloseHandle(wh);
	wh = INVALID_HANDLE_VALUE;
	DeleteProcThreadAttributeList(attribute_list);

	if (after_cb) after_cb();

	char buff[1024];
	DWORD bytes_read;
	while (true) {
		if (!ReadFile(rh, buff, 1024, &bytes_read, NULL)) {
			if (GetLastError() == ERROR_BROKEN_PIPE) break;
			error_exit("Failed to read " + mark);
		}

		if (!bytes_read) break;
		cout << mark << ": " << string(buff, bytes_read) << endl;
	}

	WaitForSingleObject(pi.hProcess, INFINITE);
	DWORD errorcode = 0;
	if (!GetExitCodeProcess(pi.hProcess, &errorcode)) error_exit("Failed to GetExitCodeProcess");
	CloseHandle(pi.hThread);
	CloseHandle(pi.hProcess);
	CloseHandle(rh);
	cout << mark << " exited, code=" << errorcode << endl;
}

void spawn_child1()
{
	// LPSTR is a non-const char*, so pass a writable buffer rather than a string literal
	char cmd[] = "childproc 30";
	spawn_child(cmd, "c1", []() {
		// wait until child2 has been spawned
		while (!g_child2_spawned) {
			this_thread::sleep_for(chrono::milliseconds(50));
		}
	});
}

void spawn_child2()
{
	char cmd[] = "childproc 60";
	spawn_child(cmd, "c2", nullptr, []() {
		g_child2_spawned = true;
	});
}

int main(int argc, char *argv[])
{
	unique_ptr<thread> th1{ new thread(spawn_child1) };
	unique_ptr<thread> th2{ new thread(spawn_child2) };
	th1->join();
	th2->join();
	cout << "main exited" << endl;
	return 0;
}

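Solution with boost::process

The sections above show how to keep a child process created with the Windows API from accidentally inheriting handles meant for other child processes. How do we solve the same problem when using boost::process?

As follows: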

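#include <iostream>
#include <string>
#include <atomic>
#include <chrono>
#include <memory>
#include <thread>
#include <Windows.h> // InitializeProcThreadAttributeList and related APIs
#include <boost/process.hpp>
#include <boost/process/windows.hpp>
#include <boost/process/extend.hpp>
#include <boost/process/detail/handler_base.hpp>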

using namespace std;
namespace bp = boost::process;
using bp::detail::throw_last_error;

atomic_bool g_child2_spawned = false;

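struct only_inherted_stdhandle: bp::detail::handler_base {
	mutable unique_ptr<char[]> attr_buff;
	mutable LPPROC_THREAD_ATTRIBUTE_LIST attribute_list = nullptr;
	// UpdateProcThreadAttribute keeps a pointer to this array, so it must stay
	// alive until DeleteProcThreadAttributeList and cannot be local to on_setup
	mutable HANDLE handles[3];

	template <class Executor>
	void on_setup(Executor&e) const {
		int count = 0;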
		if (e.startup_info_ex.StartupInfo.hStdInput != INVALID_HANDLE_VALUE) {
			handles[count++] = e.startup_info_ex.StartupInfo.hStdInput;
		}

		if (e.startup_info_ex.StartupInfo.hStdOutput != INVALID_HANDLE_VALUE) {
			handles[count++] = e.startup_info_ex.StartupInfo.hStdOutput;
		}

		if (e.startup_info_ex.StartupInfo.hStdError != INVALID_HANDLE_VALUE) {
			handles[count++] = e.startup_info_ex.StartupInfo.hStdError;
		}

		if (!count) return;
		SIZE_T size = 0;
		BOOL rc = InitializeProcThreadAttributeList(NULL, 1, 0, &size) ||
			GetLastError() == ERROR_INSUFFICIENT_BUFFER;
		if (!rc) bp::detail::throw_last_error("Failed to InitializeProcThreadAttributeList");
		attr_buff.reset(new char[size]);
		attribute_list = reinterpret_cast<LPPROC_THREAD_ATTRIBUTE_LIST>(attr_buff.get());
		if (!InitializeProcThreadAttributeList(attribute_list, 1, 0, &size)) {
			attribute_list = nullptr;
			bp::detail::throw_last_error("Failed to InitializeProcThreadAttributeList");
		}

		if (!UpdateProcThreadAttribute(attribute_list, 0, PROC_THREAD_ATTRIBUTE_HANDLE_LIST,
			handles, count * sizeof(HANDLE), NULL, NULL))
			bp::detail::throw_last_error("Failed to UpdateProcThreadAttribute");
		e.startup_info_ex.lpAttributeList = attribute_list;
	}

	template <class Executor>
	void on_error(Executor&, const std::error_code&) const {
		if (attribute_list) {
			DeleteProcThreadAttributeList(attribute_list);
			attribute_list = nullptr;
		}
	}

	template <class Executor>
	void on_success(Executor&) const {
		if (attribute_list) {
			DeleteProcThreadAttributeList(attribute_list);
			attribute_list = nullptr;
		}
	}
};

void error_exit(const char *msg)
{
	error_code ec(GetLastError(), system_category());
	cerr << msg << ", err=" << ec.value() << ", " << ec.message() << endl;
	exit(1);
}

void spawn_child1()
{
	bp::ipstream is;
	std::error_code ec;
	auto wait_child2_spawned = bp::extend::on_setup([](auto &e) {
		while (!g_child2_spawned) {
			this_thread::sleep_for(chrono::milliseconds(50));
		}
	});
	bp::child ch("childproc 30", bp::windows::hide, ec, bp::std_out > is, wait_child2_spawned, only_inherted_stdhandle());

	string line;
	while (ch.running() && getline(is, line)) {
		cout << "c1: " << line << endl;
	}

	ch.wait();
	cout << "c1 exited, code=" << ch.exit_code() << endl;
}

void spawn_child2()
{
	bp::ipstream is;
	std::error_code ec;
	auto denote_spawned = bp::extend::on_success([](auto &e) {
		g_child2_spawned = true;
	});
	bp::child ch("childproc 60", bp::windows::hide, ec, bp::std_out > is, denote_spawned, only_inherted_stdhandle());

	string line;
	while (ch.running() && getline(is, line)) {
		cout << "c2: " << line << endl;
	}

	ch.wait();
	cout << "c2 exited, code=" << ch.exit_code() << endl;
}

int main(int argc, char *argv[])
{
	unique_ptr<thread> th1{ new thread(spawn_child1) };
	unique_ptr<thread> th2{ new thread(spawn_child2) };
	th1->join();
	th2->join();
	cout << "main exited" << endl;
	return 0;
}

By passing in a custom handler_base subclass, we make boost call our on_setup before it calls CreateProcess, and on_setup specifies the handles to inherit. Note that the only_inherted_stdhandle() argument must come after bp::std_out > is.

Questions

Here are a few questions for the reader to think about:

  1. Why must only_inherted_stdhandle() come after bp::std_out > is?
  2. Can a similar problem occur on Linux, and why?

References