pygmt.clib.Session.virtualfile_to_dataset
- Session.virtualfile_to_dataset(vfname, output_type='pandas', header=None, column_names=None, dtype=None, index_col=None)
Output a tabular dataset stored in a virtual file to a different format.
The format of the dataset is determined by the output_type parameter.
- Parameters:
vfname (str) – The virtual file name that stores the result data. Required for "pandas" and "numpy" output type.
output_type (Literal['pandas', 'numpy', 'file', 'strings'], default: 'pandas') – Desired output type of the result data.
  - "pandas" will return a pandas.DataFrame object.
  - "numpy" will return a numpy.ndarray object.
  - "file" means the result was saved to a file and will return None.
  - "strings" will return the trailing text only as an array of strings.
header (int | None, default: None) – Row number containing column names for the pandas.DataFrame output. header=None means the column names are not parsed from the table header. Ignored if the row number is larger than the number of headers in the table.
column_names (list[str] | None, default: None) – The column names for the pandas.DataFrame output.
dtype (type | dict[str, type] | None, default: None) – Data type for the columns of the pandas.DataFrame output. Can be a single type for all columns or a dictionary mapping column names to types.
index_col (str | int | None, default: None) – Column to set as the index of the pandas.DataFrame output.
- Return type:
DataFrame | ndarray | None
- Returns:
result – The result dataset. If output_type="file", returns None.
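The DataFrame-related parameters can be pictured as ordinary pandas operations. The sketch below does not call PyGMT: it builds a stand-in frame by hand and assumes that column_names, dtype, and index_col correspond roughly to assigning column labels, calling astype, and calling set_index. That mapping is an assumption made for illustration, not a statement about how PyGMT implements the method.

```python
import pandas as pd

# Stand-in for a table with three numeric columns plus trailing text.
# Built by hand; PyGMT is not involved in this sketch.
df = pd.DataFrame(
    [
        [1.0, 2.0, 3.0, "TEXT1 TEXT23"],
        [4.0, 5.0, 6.0, "TEXT4 TEXT567"],
        [7.0, 8.0, 9.0, "TEXT8 TEXT90"],
        [10.0, 11.0, 12.0, "TEXT123 TEXT456789"],
    ]
)

# column_names: label the columns (assumed similar to assigning df.columns).
df.columns = ["col1", "col2", "col3", "coltext"]

# dtype given as a dict: cast only the named columns (assumed similar to astype).
df = df.astype({"col1": int, "col2": "float32"})

# index_col: promote one column to the index (assumed similar to set_index).
df = df.set_index("col1")

print(df)
```

Passing a single type for dtype instead of a dictionary would, under the same assumption, attempt to cast every column, which fails for a table that carries trailing text unless the type can hold strings.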
Examples
>>> from pathlib import Path
>>> import numpy as np
>>> import pandas as pd
>>>
>>> from pygmt.helpers import GMTTempFile
>>> from pygmt.clib import Session
>>>
>>> with GMTTempFile(suffix=".txt") as tmpfile:
...     # prepare the sample data file
...     with Path(tmpfile.name).open(mode="w") as fp:
...         print(">", file=fp)
...         print("1.0 2.0 3.0 TEXT1 TEXT23", file=fp)
...         print("4.0 5.0 6.0 TEXT4 TEXT567", file=fp)
...         print(">", file=fp)
...         print("7.0 8.0 9.0 TEXT8 TEXT90", file=fp)
...         print("10.0 11.0 12.0 TEXT123 TEXT456789", file=fp)
...
...     # file output
...     with Session() as lib:
...         with GMTTempFile(suffix=".txt") as outtmp:
...             with lib.virtualfile_out(
...                 kind="dataset", fname=outtmp.name
...             ) as vouttbl:
...                 lib.call_module("read", f"{tmpfile.name} {vouttbl} -Td")
...                 result = lib.virtualfile_to_dataset(
...                     vfname=vouttbl, output_type="file"
...                 )
...                 assert result is None
...                 assert Path(outtmp.name).stat().st_size > 0
...
...     # strings output
...     with Session() as lib:
...         with lib.virtualfile_out(kind="dataset") as vouttbl:
...             lib.call_module("read", f"{tmpfile.name} {vouttbl} -Td")
...             outstr = lib.virtualfile_to_dataset(
...                 vfname=vouttbl, output_type="strings"
...             )
...     assert isinstance(outstr, np.ndarray)
...     assert outstr.dtype.kind in ("S", "U")
...
...     # numpy output
...     with Session() as lib:
...         with lib.virtualfile_out(kind="dataset") as vouttbl:
...             lib.call_module("read", f"{tmpfile.name} {vouttbl} -Td")
...             outnp = lib.virtualfile_to_dataset(
...                 vfname=vouttbl, output_type="numpy"
...             )
...     assert isinstance(outnp, np.ndarray)
...
...     # pandas output
...     with Session() as lib:
...         with lib.virtualfile_out(kind="dataset") as vouttbl:
...             lib.call_module("read", f"{tmpfile.name} {vouttbl} -Td")
...             outpd = lib.virtualfile_to_dataset(
...                 vfname=vouttbl, output_type="pandas"
...             )
...     assert isinstance(outpd, pd.DataFrame)
...
...     # pandas output with specified column names
...     with Session() as lib:
...         with lib.virtualfile_out(kind="dataset") as vouttbl:
...             lib.call_module("read", f"{tmpfile.name} {vouttbl} -Td")
...             outpd2 = lib.virtualfile_to_dataset(
...                 vfname=vouttbl,
...                 output_type="pandas",
...                 column_names=["col1", "col2", "col3", "coltext"],
...             )
...     assert isinstance(outpd2, pd.DataFrame)
>>> outstr
array(['TEXT1 TEXT23', 'TEXT4 TEXT567', 'TEXT8 TEXT90',
       'TEXT123 TEXT456789'], dtype='<U18')
>>> outnp
array([[1.0, 2.0, 3.0, 'TEXT1 TEXT23'],
       [4.0, 5.0, 6.0, 'TEXT4 TEXT567'],
       [7.0, 8.0, 9.0, 'TEXT8 TEXT90'],
       [10.0, 11.0, 12.0, 'TEXT123 TEXT456789']], dtype=object)
>>> outpd
      0     1     2                   3
0   1.0   2.0   3.0        TEXT1 TEXT23
1   4.0   5.0   6.0       TEXT4 TEXT567
2   7.0   8.0   9.0        TEXT8 TEXT90
3  10.0  11.0  12.0  TEXT123 TEXT456789
>>> outpd2
   col1  col2  col3             coltext
0   1.0   2.0   3.0        TEXT1 TEXT23
1   4.0   5.0   6.0       TEXT4 TEXT567
2   7.0   8.0   9.0        TEXT8 TEXT90
3  10.0  11.0  12.0  TEXT123 TEXT456789
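The example outputs suggest how the "pandas", "numpy", and "strings" results relate to one another. The following sketch reproduces that relationship with pandas and NumPy alone, starting from a hand-built frame that mirrors outpd; the names frame, as_numpy, and as_strings are hypothetical, and the conversions are an illustration of the observed shapes and dtypes rather than PyGMT's own code path.

```python
import numpy as np
import pandas as pd

# Hand-built stand-in that mirrors the outpd frame from the example.
frame = pd.DataFrame(
    [
        [1.0, 2.0, 3.0, "TEXT1 TEXT23"],
        [4.0, 5.0, 6.0, "TEXT4 TEXT567"],
        [7.0, 8.0, 9.0, "TEXT8 TEXT90"],
        [10.0, 11.0, 12.0, "TEXT123 TEXT456789"],
    ]
)

# Mixing float columns with a text column yields an object array,
# which matches the dtype=object seen in outnp.
as_numpy = frame.to_numpy()

# The trailing text alone, as a fixed-width unicode array like outstr.
as_strings = np.array(frame[3].tolist())

print(as_numpy.dtype, as_numpy.shape)
print(as_strings.dtype.kind, as_strings)
```

Because the "numpy" result holds both numbers and text in one object array, numeric work on it usually starts by slicing out the numeric columns and casting them, for example as_numpy[:, :3].astype(float).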