Convert an R script into Python
This notebook introduces the function r2python, which converts R code into Python. It does not handle everything and is improved whenever a new case comes up. This notebook was executed with the following versions:
import sys
print(sys.version)
3.10.5 (tags/v3.10.5:f377153, Jun 6 2022, 16:14:13) [MSC v.1929 64 bit (AMD64)]
text = !python -m pip freeze antlr4-python3-runtime
[t for t in text if "antlr" in t]
['antlr4-python3-runtime==4.10']
A script as an example:
rscript = """
nb=function(y=1930){
debut=1816
MatDFemale=matrix(D$Female,nrow=111)
colnames(MatDFemale)=(debut+0):198
cly=(y-debut+1):111
deces=diag(MatDFemale[:,cly[cly%in%1:199]])
return(c(B$Female[B$Year==y],deces))}
"""
from pyensae.languages.rconverter import r2python
print(r2python(rscript, pep8=True))
from python2r_helper import make_tuple
def nb(y=1930):
    debut = 1816
    MatDFemale = matrix(D . Female, nrow=111)
    colnames(MatDFemale) .set(range((debut + 0), 198))
    cly = range((y - debut + 1), 111)
    deces = diag(MatDFemale[:, cly[set(cly) & set(range(1, 199))]])
    return make_tuple(B . Female[B . Year == y], deces)
The output contains some calls that are not implemented, such as colnames(MatDFemale) .set(range((debut + 0), 198)), because the original syntax colnames(MatDFemale)=(debut+0):198 has no direct equivalent in Python. The conversion does not adjust indices (the first position is 0 in Python and 1 in R). The parentheses in (debut+0):198 are needed to tell the converter where the expression begins. The operator %in% is converted into a set intersection.
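To make the caveats above concrete, here is a minimal hand-written sketch (not output of r2python) of how cly[cly %in% 1:199] could be expressed with NumPy. It assumes cly is an ascending integer sequence, uses numpy.isin as a stand-in for %in%, and picks y = 1900 purely for illustration. The + 1 on each upper bound reflects that R ranges include their end point while Python ranges do not.

```python
import numpy

debut = 1816
y = 1900  # illustrative value, chosen so that the R range is ascending

# R: cly = (y - debut + 1):111  (both bounds included)
cly = numpy.arange(y - debut + 1, 111 + 1)

# R: cly[cly %in% 1:199]  (keep the elements of cly found in 1..199)
mask = numpy.isin(cly, numpy.arange(1, 199 + 1))
selected = cly[mask]

# R positions start at 1, Python positions at 0: shift before indexing
columns = selected - 1
```

Unlike a set intersection, the boolean mask keeps the order and the duplicates of cly, which is closer to what R does when the result is used as an index.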
The unit tests check that the function works on the list of examples in unittests/ut_languages/data. Anything not covered by that list might require a few code changes. Some instructions, such as colnames(MatDFemale) .set(range((debut + 0), 198)), should probably be rewritten.
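As an illustration of such a rewrite, the sketch below assigns column names directly on a pandas DataFrame instead of calling colnames(...).set(...). The small DataFrame and the upper bound debut + 2 are hypothetical stand-ins chosen for the example; they are not taken from the original data.

```python
import pandas

debut = 1816

# Hypothetical stand-in for MatDFemale: 2 rows, 3 columns
MatDFemale = pandas.DataFrame([[1, 2, 3], [4, 5, 6]])

# R: colnames(MatDFemale) = (debut+0):(debut+2)
# The R range includes its upper bound, hence the + 1 in Python.
MatDFemale.columns = list(range(debut + 0, debut + 2 + 1))
```

Assigning to the columns attribute replaces all the names at once, so the list must have exactly one entry per column.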
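import numpy
import pandas


def matrix(array, nrow=None):
    # Rough stand-in for R's matrix(); numpy.resize fills row by row.
    arr = numpy.array(array)
    if nrow is not None:
        ncol = len(arr) // nrow
        arr = numpy.resize(arr, new_shape=(nrow, ncol))
    return arr


def colnames(df):
    # Only pandas DataFrames are supported.
    if isinstance(df, pandas.DataFrame):
        return list(df.columns)
    raise TypeError(type(df))


def make_tuple(*el, aslist=True):
    # Stand-in for R's c(): returns a list by default, a tuple otherwise.
    if aslist:
        return list(el)
    return tuple(el)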
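One difference worth keeping in mind: the matrix helper above fills the result row by row, whereas R's matrix() fills column by column by default. The variant below is a sketch under that assumption; the name matrix_r_order is hypothetical and not part of pyensae.

```python
import numpy


def matrix_r_order(array, nrow):
    """Hypothetical variant of matrix(): fills column by column, as R does."""
    arr = numpy.asarray(array)
    ncol = len(arr) // nrow
    # order="F" (Fortran order) fills the first column first
    return arr[: nrow * ncol].reshape((nrow, ncol), order="F")


# R equivalent: matrix(1:6, nrow=2)
m = matrix_r_order([1, 2, 3, 4, 5, 6], nrow=2)
```

Here the first column of m holds the first two input values, which matches what R returns for the same call.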