Embedding Python in Multi-Threaded C/C++ Applications

Posted 2005-10-07 15:49
By Ivan Pulleyn
Created 2000-05-01 01:00
Python provides a clean, intuitive interface to complex, threaded applications.
Developers often make use of high-level scripting languages as a way of quickly writing flexible code.
Various shell scripting languages have long been used to automate processes on UNIX systems. More recently,
software applications have begun to provide scripting layers that allow the user to automate common tasks or
even extend the feature set. Think of all the well-known applications you use: GIMP, Emacs, Word, Photoshop,
etc. It seems as though all can be scripted in some way.
In this article, I will describe how you can embed the Python language within your C applications. There
are many reasons you would want to do this. For instance, you may want to provide your more advanced users
with the ability to alter or customize the program. Or maybe you want to take advantage of a Python
capability, rather than implement it yourself. Python is a good choice for this task because it provides a
clean, intuitive C API. Since many complex applications are written using threads, I will also show you how
to create a thread-safe interface to the Python interpreter.
All the examples assume you are using Python version 1.5.2, which comes pre-installed on most recent Linux
distributions. The API to access the Python interpreter is the same for both C and C++. There are no special
C++ constructs used, and all functions are declared extern "C". For this reason, the concepts described and
the example code given here should work equally well when using either C or C++.
Overview of the Python C/C++ API
There are two ways that C and Python code can work together within the same process. Simply put, Python
code can call C code or C code can call Python code. These two methods are called ``extending'' and
``embedding'', respectively. When extending, you create a new Python module implemented in C/C++. This allows
you to provide new functionality to the Python language that cannot be implemented in Python. For instance,
several core Python modules such as ``time'' and ``nis'' are implemented as C extensions, while others are
written in Python. You never notice the difference between C and Python modules, because the act of importing
and using these modules is the same. If you look around in your /usr/lib/python1.5 directory, you may see
some shared library files (extension .so). These are Python module extensions written in C. You will also see
various Python files (extension .py) which are modules written in Python.
Typically, when you embed Python, you will develop a C/C++ application that has the ability to load and
execute Python scripts. The application will be linked against the Python interpreter library, called
libpython1.5.a, which provides all functionality related to evaluating Python code. There is no Python
executable involved, only an API for your application to use.
Embedded Python
Listing 1 [1]
Embedding Python is a relatively straightforward process. If your goal is merely to execute vanilla Python
code from within a C program, it's actually quite easy. Listing 1 is the complete source to a program that
embeds the Python interpreter. This illustrates one of the simplest programs you could write making use of
the Python interpreter.
Listing 1 uses three Python-specific function calls. Py_Initialize starts
up the Python interpreter library, causing it to allocate whatever internal resources it needs. You must call
this function before calling most other functions in the Python API. PyRun_SimpleString provides a quick,
no-frills way to execute arbitrary Python code. Interpretation of the code is immediate. In Listing 1, for
instance, the import sys line causes Python to import the sys module before returning control to the C/C++
program. Each string passed to PyRun_SimpleString must be a complete Python statement of some kind. In other
words, half statements are illegal, even if they are completed with another call to PyRun_SimpleString. For
example, the following code will not work properly:
// Python will print first error here
PyRun_SimpleString("import ");
// Python will print second error here
PyRun_SimpleString("sys\n");
Py_Finalize is the last Python function which any application that embeds
Python must call. This function shuts down the interpreter and frees any resources it allocated during its
lifetime. You should call this when you are completely finished using the Python library. When you call
Py_Finalize, Python will unload all imported modules one by one. Many modules must execute their own clean-up
code when they are unloaded in order to free any global resources they may have allocated. For this reason,
calling Py_Finalize can have the side effect of causing quite a bit of other code to run.
PyRun_SimpleString is just one way to execute Python code from within
your C applications. In fact, there is a whole collection of similar high-level functions. PyRun_SimpleFile is
just like PyRun_SimpleString, except it reads its input from a FILE pointer rather than a character buffer.
See the Python documentation at http://www.python.org/docs/api/veryhigh.html [2]
for complete documentation on these high-level functions.
In addition to evaluating Python scripts, you can also manipulate Python objects and call Python functions
directly from your C code. While this involves more complex C code than using PyRun_SimpleString, it also
allows access to more detailed information. For example, you can access objects returned from Python
functions or determine if an exception has been thrown.
Extending Python
When you embed Python within your application, it is often desirable to provide a small module that
exposes an API related to your application so that scripts executing within the embedded interpreter have a
way to call back into the application. This is done by providing your own Python module, written in C, and is
exactly the same as writing normal Python modules. The only difference is your module will function properly
only within the embedded interpreter.
Extending Python requires some understanding of how the Python interpreter manipulates objects from C. All
function arguments and return values are pointers to PyObject structures, which are the C representation of
real Python objects. You can make use of various function calls to manipulate PyObjects. Listing 2 is a
simple example of a Python module extension written in C. This is the source to the Python crypt module, which provides one-way hashing used in password authentication.
Listing 2 [3]
All C implementations of Python-callable functions take two arguments of type PyObject *. The first argument
is always ``self'', the object whose method is being called (similar to the infamous ``this'' pointer in
C++). The second argument contains all the arguments to the function. PyArg_Parse is used to extract values
from a PyObject containing function arguments. You do this by passing in the PyObject which contains the
values, a format string which represents the data types you expect to be there, and one or more pointers to
variables to be filled in with values from the PyObject. In Listing 2, the function takes two strings,
represented by "(ss)". PyArg_Parse is similar to the C function sscanf, except it operates on a PyObject
rather than a character buffer. In order to return a string value from the function, call
PyString_FromString. This helper function takes a char* value and converts it into a PyObject.
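The sscanf comparison can be made concrete in plain C. The sketch below shows only the sscanf side of the analogy: a format string describes the expected values, and pointers receive them. The function name parse_word_and_salt and the 32-byte buffers are illustrative assumptions and are not taken from the crypt module.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: pull two whitespace-separated strings out of a
 * character buffer, much as "(ss)" asks PyArg_Parse for two strings.
 * Returns 1 on success, 0 if fewer than two fields were found. */
int parse_word_and_salt(const char *buf, char word[32], char salt[32])
{
    /* %31s bounds each field so the 32-byte buffers cannot overflow. */
    int matched = sscanf(buf, "%31s %31s", word, salt);
    return matched == 2;
}
```

As with PyArg_Parse, the caller checks the return value before trusting the output variables.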
Python, C and Threads
C programs can easily create new threads of execution. Under Linux, this is most commonly done using the
POSIX Threads (pthreads) API and the function call pthread_create. For an
overview of how to use pthreads, see ``POSIX Thread Libraries'' by Felix Garcia and Javier Fernandez at http://www.linuxjournal.com/lj-issues/issue70/3184.html [4] in the
``Strictly On-line'' section of LJ, February 2000. In order to support
multi-threading, Python uses a mutex to serialize access to its internal data structures. I will refer to
this mutex as the ``global interpreter lock''. Before a given thread can make use of the Python C API, it
must hold the global interpreter lock. This avoids race conditions that could lead to corruption of the
interpreter state.
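For readers who have not used pthreads, here is a minimal sketch of creating a thread and waiting for it. It involves no Python at all; the names double_value and run_doubling_thread are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <pthread.h>

/* Thread entry point: receives a pointer to an int and doubles it in place. */
static void *double_value(void *arg)
{
    int *value = arg;
    *value *= 2;
    return NULL;
}

/* Start one thread, wait for it, and return the doubled value.
 * Returns -1 if the thread could not be created. */
int run_doubling_thread(int input)
{
    pthread_t tid;
    int value = input;

    if (pthread_create(&tid, NULL, double_value, &value) != 0)
        return -1;
    pthread_join(tid, NULL);  /* value is safe to read after the join */
    return value;
}
```

Depending on the toolchain, building this may require the -pthread compiler flag.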
The act of locking and releasing this mutex is abstracted by the Python functions PyEval_AcquireLock and PyEval_ReleaseLock. After
calling PyEval_AcquireLock, you can safely assume your thread holds the lock; all other cooperating threads
are either blocked or executing code unrelated to the internals of the Python interpreter, and you may now
call arbitrary Python functions. Once you have acquired the lock, however, you must be certain to release it later by
calling PyEval_ReleaseLock. Failure to do so will cause a thread deadlock and freeze all other Python threads.
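The acquire/release discipline is the same one used with any mutex. The following sketch is a pthread analogue, not the Python API: toy_lock stands in for the global interpreter lock and toy_shared_counter for the interpreter's internal state. All names are assumptions made for illustration.

```c
#include <assert.h>
#include <pthread.h>

/* Toy stand-in for the global interpreter lock: one mutex guarding
 * all shared state. This is NOT the Python API. */
static pthread_mutex_t toy_lock = PTHREAD_MUTEX_INITIALIZER;
static long toy_shared_counter = 0;

#define TOY_ITERATIONS 100000

static void *toy_worker(void *unused)
{
    (void)unused;
    for (int i = 0; i < TOY_ITERATIONS; i++) {
        pthread_mutex_lock(&toy_lock);    /* role of PyEval_AcquireLock */
        toy_shared_counter++;             /* touch the shared state */
        pthread_mutex_unlock(&toy_lock);  /* role of PyEval_ReleaseLock */
    }
    return NULL;
}

/* Run two workers to completion and return the final counter value. */
long toy_run_two_workers(void)
{
    pthread_t a, b;
    int rc;

    toy_shared_counter = 0;
    rc = pthread_create(&a, NULL, toy_worker, NULL);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, toy_worker, NULL);
    assert(rc == 0);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return toy_shared_counter;
}
```

Because every increment happens with the lock held, no updates are lost. Omitting the unlock call would leave the second worker blocked forever, which is the deadlock described above.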
To complicate matters further, each thread running Python maintains its own state information. This
thread-specific data is stored in an object called PyThreadState. When calling Python API functions
from C in a multi-threaded application, you must maintain your own PyThreadState objects in order to safely
execute concurrent Python code.
If you are experienced in developing threaded applications, you might find the idea of a global
interpreter lock rather unpleasant. Well, it's not as bad as it first appears. While Python is interpreting
scripts, it periodically yields control to other threads by swapping out the current PyThreadState object and
releasing the global interpreter lock. Threads previously blocked while attempting to lock the global
interpreter lock will now be able to run. At some point, the original thread will regain control of the
global interpreter lock and swap itself back in.
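The periodic yield can be pictured with a small pthread model. This is a sketch of the idea only, not interpreter code: toy_gil, toy_interpret and the interval of 10 steps are all hypothetical.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t toy_gil = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical interval: how many steps run between yields. */
#define TOY_CHECK_INTERVAL 10

/* Execute `steps` units of work while holding toy_gil, briefly dropping
 * the lock every TOY_CHECK_INTERVAL steps so that blocked threads get a
 * chance to run. Returns how many times the lock was yielded. */
int toy_interpret(int steps)
{
    int yields = 0;

    pthread_mutex_lock(&toy_gil);
    for (int i = 1; i <= steps; i++) {
        /* ... one unit of interpreter work would happen here ... */
        if (i % TOY_CHECK_INTERVAL == 0) {
            pthread_mutex_unlock(&toy_gil);  /* let a waiting thread in */
            pthread_mutex_lock(&toy_gil);    /* then compete to resume */
            yields++;
        }
    }
    pthread_mutex_unlock(&toy_gil);
    return yields;
}
```

A caller that runs 100 steps yields ten times, so a second thread blocked on the same mutex never waits longer than one interval's worth of work.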
This means when you call PyRun_SimpleString, you are faced with the unavoidable side effect that other
threads will have a chance to execute, even though you hold the global interpreter lock. In addition, making
calls to Python modules written in C (including many of the built-in modules) opens the possibility of
yielding control to other threads. For this reason, two C threads that execute computationally intensive
Python scripts will indeed appear to share CPU time and run concurrently. The downside is that, due to the
existence of the global interpreter lock, Python cannot fully utilize CPUs on multi-processor machines using threads.
Enabling Thread Support
Before your threaded C program is able to make use of the Python API, it must call some initialization
routines. If the interpreter library is compiled with thread support enabled (as is usually the case), you
have the runtime option of enabling threads or not. Do not enable runtime threading support unless you plan
on using threads. If runtime support is not enabled, Python will be able to avoid the overhead associated
with mutex locking its internal data structures. If you are using Python to extend a threaded application,
you will need to enable thread support when you initialize the interpreter. I recommend initializing Python
from within your main thread of execution, preferably during application startup, using the following two
lines of code:
// initialize Python
Py_Initialize();
// initialize thread support
PyEval_InitThreads();
Both functions return void, so there are no error codes to check. You can now assume the Python
interpreter is ready to execute Python code. Py_Initialize allocates global
resources used by the interpreter library. Calling PyEval_InitThreads turns
on the runtime thread support. This causes Python to enable its internal mutex lock mechanism, used to
serialize access to critical sections of code within the interpreter. This function also has the side effect
of locking the global interpreter lock. Once the function completes, you are responsible for releasing the
lock. Before releasing the lock, however, you should grab a pointer to the current PyThreadState object. You
will need this later in order to create new Python threads and to shut down the interpreter properly when you
are finished using Python. Use the following bit of code to do this:
PyThreadState * mainThreadState = NULL;
// save a pointer to the main PyThreadState object
mainThreadState = PyThreadState_Get();
// release the lock
PyEval_ReleaseLock();
Creating a New Thread of Execution
Python requires a PyThreadState object for each thread that is executing Python code. The interpreter uses
this object to manage a separate interpreter data space for each thread. In theory, this means that actions
taken in one thread should not interfere with the state of another thread. For instance, if you throw an
exception in one thread, the other snippets of Python code keep running as if nothing happened. You must help
Python to manage per-thread data. To do this, manually create a PyThreadState object for each C thread that
will execute Python code. In order to create a new PyThreadState object, you need a pre-existing
PyInterpreterState object. The PyInterpreterState object holds information that is shared across all
cooperating threads. When you initialized Python, it created a PyInterpreterState object and attached it to
the main PyThreadState object. You can use this interpreter object to create a new PyThreadState for your own
C thread. Here's some example code which does just that (ignore line wrapping):
// get the global lock
PyEval_AcquireLock();
// get a reference to the PyInterpreterState
PyInterpreterState * mainInterpreterState = mainThreadState->interp;
// create a thread state object for this thread
PyThreadState * myThreadState = PyThreadState_New(mainInterpreterState);
// free the lock
PyEval_ReleaseLock();
Executing Python Code
Now that you have created a PyThreadState object, your C thread can begin to use the Python API to execute
Python scripts. You must adhere to a few simple rules when executing Python code from a C thread. First, you
must hold the global interpreter lock before doing anything that alters the current thread state. Second,
you must load your thread-specific PyThreadState object into the interpreter before executing
any Python code. Once you have satisfied these constraints, you can execute arbitrary Python code by using
functions such as PyRun_SimpleString. Remember to swap out your PyThreadState object and release the global
interpreter lock when done. Note the symmetry of ``lock, swap, execute, swap, unlock'' in the code (ignore
line wrapping):
// grab the global interpreter lock
PyEval_AcquireLock();
// swap in my thread state
PyThreadState_Swap(myThreadState);
// execute some python code
PyRun_SimpleString("import sys\n");
PyRun_SimpleString("sys.stdout.write('Hello from a C thread!\\n')\n");
// clear the thread state
PyThreadState_Swap(NULL);
// release our hold on the global interpreter
PyEval_ReleaseLock();
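The same ``lock, swap, execute, swap, unlock'' sequence can be tried out without linking against Python. The model below is a rough sketch built on pthreads; ToyThreadState, toy_swap and the other names are hypothetical stand-ins for the roles played by PyThreadState and PyThreadState_Swap, and they make no claim about how the interpreter is implemented.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Toy per-thread state, loosely modelled on the role of PyThreadState. */
typedef struct {
    int id;
    int error_set;   /* stands in for a pending exception */
} ToyThreadState;

static pthread_mutex_t toy_state_lock = PTHREAD_MUTEX_INITIALIZER;
static ToyThreadState *toy_current = NULL;  /* guarded by toy_state_lock */

/* Install `next` as the current state and return the previous one. */
static ToyThreadState *toy_swap(ToyThreadState *next)
{
    ToyThreadState *prev = toy_current;
    toy_current = next;
    return prev;
}

/* "Execute" code: may record an error in whichever state is current. */
static void toy_execute(int fail)
{
    assert(toy_current != NULL);
    if (fail)
        toy_current->error_set = 1;
}

/* lock, swap in, execute, swap out, unlock */
void toy_run_in_state(ToyThreadState *ts, int fail)
{
    pthread_mutex_lock(&toy_state_lock);
    toy_swap(ts);
    toy_execute(fail);
    toy_swap(NULL);
    pthread_mutex_unlock(&toy_state_lock);
}
```

An error recorded while one state is swapped in stays in that state, so a second thread running with its own state object does not see it.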
Cleaning Up a Thread
Once your C thread is no longer using the Python interpreter, you must dispose of its resources. To do
this, delete your PyThreadState object. This is accomplished with the following code:
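// grab the lock
PyEval_AcquireLock();
// swap my thread state out of the interpreter
PyThreadState_Swap(NULL);
// clear out any cruft from thread state object
PyThreadState_Clear(myThreadState);
// delete my thread state object
PyThreadState_Delete(myThreadState);
// release the lock
PyEval_ReleaseLock();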
This thread is now effectively done using the Python API. You may safely call pthread_exit at this point to halt execution of the thread.
Shutting Down the Interpreter
Once your application has finished using the Python interpreter, you can shut down Python support with the
following code:
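// shut down the interpreter
PyEval_AcquireLock();
// swap the main thread state back in before finalizing
PyThreadState_Swap(mainThreadState);
Py_Finalize();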
Note there is no reason to release the lock, because Python has been shut down. Be certain to delete all
your thread-state objects with PyThreadState_Clear and PyThreadState_Delete before calling Py_Finalize.
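One way an application might enforce this ordering is to count the thread states it has created and refuse to shut down while any remain. The sketch below is application-side bookkeeping under that assumption; it is not part of the Python API, and every name in it is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical application-side record of one thread's state. */
typedef struct {
    int id;
} ToyThreadState;

/* States created but not yet deleted. In a threaded program, update this
 * only while holding whatever lock protects the interpreter. */
static int toy_live_states = 0;

ToyThreadState *toy_state_new(int id)
{
    ToyThreadState *ts = malloc(sizeof *ts);
    if (ts != NULL) {
        ts->id = id;
        toy_live_states++;
    }
    return ts;
}

void toy_state_delete(ToyThreadState *ts)
{
    if (ts != NULL) {
        free(ts);
        toy_live_states--;
    }
}

/* Returns 0 if shutdown may proceed, -1 if thread states are still alive. */
int toy_can_finalize(void)
{
    return toy_live_states == 0 ? 0 : -1;
}
```

A shutdown routine would call toy_can_finalize first and only go on to finalize the interpreter when it returns 0.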
Python is a good choice for use as an embedded language. The interpreter provides support for both
embedding and extending, which allows two-way communication between C application code and embedded Python
scripts. In addition, the threading support facilitates integration with multi-threaded applications without
compromising performance.
You can download example source code at ftp://ftp.ssc.com/pub/lj/listings/issue73/3641.tgz [5]. This includes an example implementation
of a multi-threaded HTTP server with an embedded Python interpreter. In order to learn more about the
implementation details, I recommend reading the Python C API documentation at http://www.python.org/docs/api/ [6]. In addition, I have found
the Python interpreter code itself to be an invaluable reference.
Ivan Pulleyn can be reached via e-mail at ivan@torpid.com [7].

[1] http://www.linuxjournal.com//articles/lj/0073/3641/3641l1.html
[2] http://www.linuxjournal.com/
[3] http://www.linuxjournal.com//articles/lj/0073/3641/3641l2.html
[4] http://www.linuxjournal.com//article/3184
[5] ftp://ftp.ssc.com/pub/lj/listings/issue73/3641.tgz
[6] http://www.linuxjournal.com/
[7] http://www.linuxjournal.com/mailto:ivan@torpid.com
[8] https://www.ssc.com/lj/subs/NewUSA.html