By Andy May
This is an introductory post to a series on using R to read IGRA2 radiosonde data, process it, and produce both plots and maps of the data. I started using R over 10 years ago, mainly because it was a free and very powerful language for statistical analysis (download the current 64-bit Windows version here). At the time, it was a clunky programming language and difficult to use, but that has recently changed. While working on new R programs to analyze the radiosonde data, I saw the many substantial improvements to the language added since around 2020. It is now a very impressive language and much easier to use and to read. Before we get into the radiosonde analysis, I’d like to cover the recent improvements in the language. Future posts in this series will provide more details about the R language and my analysis of IGRA2.
Reading and writing data
I used the original “base” R “readLines” function to read the IGRA2 files because each record had to be read as a text string and later parsed into its component values. For this purpose, readLines is ideal and efficient. However, for efficiency, once the records were parsed and prepared for processing in R, I used “fwrite” from the data.table R package to write the resulting data frames (R table structures whose columns can hold a mixture of data types) or tibbles (a more modern form of the data frame) to disk. Both data frames and tibbles are organized as tables in which observations are rows and variables, or measurements, are columns. Unlike a matrix, different columns can have different data types, for example character, floating-point, or integer.
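As a sketch of the idea, fixed-width records read with readLines can be parsed with substr. Note that the field positions below are invented for illustration; they are NOT the actual IGRA2 record layout.

```r
# A sketch of parsing fixed-width text records with base R.
# NOTE: these field positions are invented; they are NOT the IGRA2 layout.
lines <- c("00100 01234  250",
           "00200 05678 -120")  # in practice: lines <- readLines("station.txt")

parsed <- data.frame(
  pressure = as.integer(substr(lines, 1, 5)),   # characters 1-5
  height   = as.integer(substr(lines, 7, 11)),  # characters 7-11
  temp     = as.integer(substr(lines, 13, 16))  # characters 13-16
)
parsed
```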
The function fwrite was written by Otto Seiskari and Matt Dowle and first released in 2016. It became fully parallel by 2020; it is 10-100 times faster than the alternative write functions. For reading comma delimited files (CSV files) the companion fread function is also very fast and efficient.
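A minimal round trip with fwrite and fread looks like this (a sketch, assuming the data.table package is installed):

```r
library(data.table)

dt <- data.table(id = 1:3, value = c(1.5, 2.5, 3.5))
tmp <- tempfile(fileext = ".csv")
fwrite(dt, tmp)     # fast, multi-threaded CSV writer
back <- fread(tmp)  # fast CSV reader; returns a data.table
back
```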
dplyr: data manipulation
In June 2020, version 1.0.0 of the very useful dplyr R package, written by Hadley Wickham, was released (earlier versions date back to 2014, but 1.0.0 is the release I rely on). The operators and functions in this package are widely used in my IGRA2 processing. The %>% operator (called a ‘pipe’) is especially useful because it makes R code so much more readable and intuitive (see below). Dplyr is used to organize data into subsets (the filter, select, arrange, and group_by functions), add variables to an existing data frame or tibble (the mutate function), or compute values that summarize a group (the summarise function). Other useful column functions are across, rename, relocate, and pull. Row (observation) manipulation functions include rowwise, slice, and distinct.
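As a small illustration of these verbs (the station table below is made up, not real IGRA2 data):

```r
library(dplyr)

# A made-up table of station temperature observations (degrees C)
obs <- tibble(
  station = c("A", "A", "B", "B"),
  temp    = c(15.2, 14.8, 9.1, 8.7)
)

means <- obs %>%
  filter(temp > 0) %>%                # subset rows
  mutate(temp_K = temp + 273.15) %>%  # add a column
  group_by(station) %>%               # group the rows by station
  summarise(mean_K = mean(temp_K))    # one summary row per group
means
```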
Pipes (%>%) for making code more readable
Pipes are supplied by magrittr, which is loaded when dplyr is loaded. In summary, the value to the left of the pipe (lhs), is placed as the first argument of the function on the righthand side (rhs), that is:
lhs %>% rhs is the same as rhs(lhs).
Simple Example
Without using a pipe, the following nested and inside out code is used:
round(mean(sqrt(c(1, 4, 9, 16)), na.rm = TRUE), 2)
# → 2.5
With a pipe we can write this chained and readable equivalent code:
c(1, 4, 9, 16) %>%
sqrt() %>%
mean(na.rm = TRUE) %>%
round(2)
# → 2.5
Use the dot ‘.’ placeholder when you need the piped value in a non-first position:
df %>% lm(y ~ x, data = .) # the '.' represents the lhs (df)
tidyverse: integrating data manipulation and plotting
Hadley Wickham and his collaborators, including Romain François, Winston Chang, Garrett Grolemund, Lionel Henry, and others, developed and integrated the components of the tidyverse, a suite of consistent packages for modern R workflows. The key components of the tidyverse are ggplot2, tidyr, dplyr, readr, purrr, and tibble. The tidyverse was a mature set of programming tools by 2019, when the paper “Welcome to the Tidyverse” was published (Wickham et al., 2019).
Using concepts developed by Hadley Wickham and described in his paper “Tidy Data” (Wickham, 2014), the tidyverse is designed to help the user clean up messy datasets like IGRA2. As Wickham says, 80% of data analysis is spent on the process of cleaning and preparing the data. Much of the preparation time is spent preparing the output from one tool or device so it can be input to another. The tidyverse provides a wide range of tools and display functions that all work on basic R data frames and tidyverse tibbles. Further, it allows tibbles to be easily converted to data frames and back.
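For example, tidyr (part of the tidyverse) reshapes a “messy” wide table into tidy form, one observation per row. The table below is invented for illustration:

```r
library(tidyr)

# A "messy" wide table: one column per year
wide <- data.frame(station = c("A", "B"),
                   y2020 = c(14.1, 9.3),
                   y2021 = c(14.4, 9.0))

# Tidy form: one row per station-year observation
tidy <- pivot_longer(wide, cols = c(y2020, y2021),
                     names_to = "year", values_to = "temp")
tidy
```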
ggplot2: the plotting workhorse
The ggplot2 package is a very powerful part of the tidyverse that allows useful and attractive graphs and maps to be made. All the graphs in my most recent paper, except one, were made using ggplot2 (May, 2025).
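A minimal ggplot2 example, using R’s built-in pressure dataset rather than radiosonde data:

```r
library(ggplot2)

# Vapor pressure of mercury vs. temperature (built-in dataset)
p <- ggplot(pressure, aes(x = temperature, y = pressure)) +
  geom_point() +
  geom_line() +
  labs(x = "Temperature (C)", y = "Vapor pressure (mm Hg)")
# print(p) draws the plot
```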
Hadley Wickham created the initial version of ggplot2 during his PhD studies at Iowa State University, inspired by Leland Wilkinson’s 1999 book The Grammar of Graphics. It provided a more structured alternative to R’s base graphics. The first CRAN release appeared in 2007 and version 1.0.0 in 2014, but a full and modern mapping capability had to wait for the introduction and maturing of terra by Robert Hijmans. Terra did not integrate very well with ggplot2 and the tidyverse at first, and this required a major update to ggplot2, which was completed in September of 2025.
Terra: Mapping
Terra was developed to replace the raster and sp packages (first released around 2010 and 2005, respectively), which were the cornerstones of mapping in R before terra was released and matured around 2023. Robert J. Hijmans (University of California, Davis) created terra to address problems with the earlier R mapping system; terra has a simpler interface, faster performance, expanded capabilities (e.g., better vector integration), and streamlined data classes.
Terra provides efficient methods for geometric operations, local/focal/zonal/global computations, spatial predictions (e.g., via interpolation or machine learning models), and processing of very large files. Key data classes are SpatRaster (replacing multiple raster classes) and SpatVector, which provides robust vector handling. Terra is written in C++ for speed.
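A small sketch of the SpatRaster class and a global computation (assuming the terra package is installed):

```r
library(terra)

# A 10 x 20 global grid of random values
r <- rast(nrows = 10, ncols = 20,
          xmin = -180, xmax = 180, ymin = -90, ymax = 90)
values(r) <- runif(ncell(r))

g <- global(r, "mean")  # mean over all cells
g
```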
Tidyterra: integrating data manipulation and mapping
Tidyterra is a package developed by Diego Hernangómez that successfully allows tidyverse and ggplot2 to work with the spatial mapping package terra. Finally, ggplot2 display functions and the tidyverse data manipulation functions can work seamlessly with the terra mapping functions (Hernangómez, 2023).
Tidyterra extends the functionality of the ggplot2 package by providing additional functions specific to mapping, like geom_spatraster and geom_spatvector, as well as other functionality specifically designed for map production. SpatVector objects hold points, lines, and polygons, and SpatRaster objects consist of equal-sized rectangles (cells) that each contain one or more values (Hernangómez, 2023).
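A minimal sketch of that integration (assuming terra, tidyterra, and ggplot2 are installed):

```r
library(terra)
library(tidyterra)
library(ggplot2)

# A small global SpatRaster of random values
r <- rast(nrows = 10, ncols = 20,
          xmin = -180, xmax = 180, ymin = -90, ymax = 90)
values(r) <- runif(ncell(r))

# geom_spatraster (from tidyterra) lets ggplot2 draw the raster directly
p <- ggplot() +
  geom_spatraster(data = r) +
  scale_fill_viridis_c(name = "value")
# print(p) draws the map
```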
Summary
Thus, with the completion of all this work between 2023 and 2025, R became a very robust data processing and display programming language. In the next few posts, I will provide some critical details about how I used this powerful new version of R to make the data displays in May (2025) and in the paper’s supplementary materials (May, 2025b).
Works Cited
Hernangómez, D. (2023). Using the tidyverse with terra objects: the tidyterra package. J of Open Source Software, 8(91). https://doi.org/10.21105/joss.05751
May, A. (2025). The Molar Density Tropopause Proxy and its relation to the ITCZ and Hadley Circulation. OSF. https://doi.org/10.17605/OSF.IO/KBP9S
May, A. (2025b, November 28). Supplementary Materials: The Molar Density Tropopause Proxy and Its Relation to the ITCZ and Hadley Circulation. https://doi.org/10.5281/zenodo.17752293
Wickham, H. (2014). Tidy Data. J of Statistical Software, 59(10). https://doi.org/10.18637/jss.v059.i10
Wickham, H., Averick, M., Bryan, J., Chang, W., McGowan, L. D., & François, R. (2019). Welcome to the Tidyverse. Journal of Open Source Software, 4(43). https://doi.org/10.21105/joss.01686
Thanks, Andy. I use R exclusively for my work. A question.
You recommend fread and fwrite for saving and reading csv files and data frames to-from disk. Do they work for arrays?
I ask because I’ve never used a csv or data.frame file so large that read or write times were an issue. The big data I use are all 3-d arrays, which (near as I can tell) are not read or written by fread/fwrite.
Great article,
w.
Thanks Willis. You can write arrays to csv files and read them back with fwrite and fread, but fread always reads the data into a data.table or data.frame. You should always be able to handle a data.table or data.frame as if it were an array, and you will have many more options for manipulating it. If you have to turn it into an array you can, as follows (from Grok):
library(data.table)
dt <- data.table(
a = 1:5,
b = 11:15,
c = 21:25
)
# Option A: Convert to matrix (most frequent choice)
m <- as.matrix(dt)
# or
m <- as.matrix(dt[]) # the [] sometimes helps with certain edge cases
class(m) # → "matrix" "array"
dim(m)   # → 5 3
If you can read and write your data with fread and fwrite, it will almost always be much faster.
Reading & writing various formats including particularly CSVs is very important . CoSy has an extensive vocabulary mainly at https://cosy.com/4thCoSy/Code/CoSy/Furniture.f .
Tried it. No joy. I store my gridded global data time series as 3D arrays (lat, long, time). This appears to be incompatible with fread and fwrite.
Best regards,
w.
fread and fwrite are for tables of data. If you want to try again, explain to Grok what you want to do, for example: read a csv with fread and convert it to a 3D array, and ask it if it can find a way to do it. Include some example data. I usually put together questions in a text editor.
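For what it’s worth, here is one way it can be done: store the array in “long” form (one row per cell), round-trip that through fwrite/fread, and rebuild the array from the index columns. A sketch assuming the data.table package; the lat/lon/time names are just illustrative:

```r
library(data.table)

a <- array(1:24, dim = c(2, 3, 4))  # a small lat x lon x time array

# Flatten to one row per cell: (lat, lon, time, value)
idx  <- which(array(TRUE, dim(a)), arr.ind = TRUE)  # every (i, j, k) triple
long <- data.table(lat = idx[, 1], lon = idx[, 2],
                   time = idx[, 3], value = a[idx])

tmp <- tempfile(fileext = ".csv")
fwrite(long, tmp)
back <- fread(tmp)

# Rebuild the array using matrix indexing on the index columns
a2 <- array(NA_integer_, dim = c(2, 3, 4))
a2[cbind(back$lat, back$lon, back$time)] <- back$value
```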
Are you using netCDF or GRIB file formats? If so, which packages in R are you using to read these files?
library(RNetCDF)
or
library(ncdf4)
RNetCDF usually works for most netCDF files. I don’t know about GRIB files.
For examples see the R code linked at the bottom of this post:
https://andymaypetrophysicist.com/2020/12/12/the-ocean-mixed-layer-sst-and-climate-change/
Willis,
WRT 3D arrays, the usual way this is handled with data frames and data tables is to make a list of them. So, one might have a large number of data frames, which can all be the same size or not, and then compile them into a list. It is common to do this and there are many functions built to manipulate them.
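A minimal sketch of that pattern, using R’s built-in mtcars data:

```r
# One data frame per group, compiled into a list
dfs <- split(mtcars, mtcars$cyl)  # list of 3 data frames (cyl = 4, 6, 8)

sapply(dfs, nrow)    # functions apply across the whole list
all <- do.call(rbind, dfs)  # and the pieces recombine easily
```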
Why not just use the array? I store my gridded global data time series as 3D arrays (lat, long, time). That lets me use functions like
apply(somearray, 3, sum)
to get month-by-month sums. Seems to me this would be far more complex if they were converted to lists. But maybe I’m not understanding what you mean.
w.
If your data is ideally stored in 3D arrays, I don’t think a list of data.frames would improve or add anything, unless you need the subset, grouping, summarize, and filter options. Those options are nice, but in your case, they may not be useful. Base R standard array index manipulation can do a lot.
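For reference, the base-R array operations in question look like this on a made-up (lat, lon, time) array:

```r
a <- array(rnorm(2 * 3 * 4), dim = c(2, 3, 4))  # lat x lon x time

slice1   <- a[, , 1]                 # the full grid at the first time step
by_time  <- apply(a, 3, mean)        # one value per time step (length 4)
time_avg <- apply(a, c(1, 2), mean)  # time-mean map (2 x 3)
```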
Thanks for this, finally something I can agree with. As a hobbyist, discovering the tidyverse packages revolutionised how I used R. I definitely need to look into terra now.
One small quibble – I’m pretty sure that dplyr has been around since well before 2021.
plyr, the early version of dplyr, was released in 2011. dplyr was officially released in 2014, mainly for testing and debugging purposes. The 1.0.0 release, which is what I referred to in the post, was not released until June 2020. Prior to the 1.0.0 release, it was not widely used and had problems. It still has a few problems but is very usable today, not so much in 2014-2019. As with all software, there are always bugs, but they become more manageable with time.
I wouldn’t describe the releases before 1.0.0 as mainly for testing purposes. I’m sure I kept reading articles recommending it for general use well before 2020.
A search for dplyr articles from 2014 shows many recommending it. None I could find said it should only be used for testing.
I see I had a typo in there, 2021 versus 2020. Thanks for catching it, I fixed it.
A good write up, thanks
I must be missing something here. Isn’t the mean of the sqrt of 1, 4, 9 and 16 going to be 2.5?
eg (1+2+3+4)/4
Or should it be the sqrt of (1+4+9+16)/4 = 2.7
Where did 2.9 come from?
Argggh! Quite right, it is 2.5 for both. I just ran them through R, which I should have done in the first place to make sure. The example is from Grok, which makes a lot of mistakes. I usually check every bit of code that Grok writes, since there is usually a bug in it somewhere. But I skipped that critical step this time, my fault.
The advantage of using Grok to help with your programming is it often comes up with novel code that you would not have thought of, and it is better. The disadvantage is that Grok makes a lot of mistakes and you must check everything! Here are the runs through R:
> round(mean(sqrt(c(1, 4, 9, 16)), na.rm = TRUE), 2)
[1] 2.5
> c(1, 4, 9, 16) %>%
+ sqrt() %>%
+ mean(na.rm = TRUE) %>%
+ round(2)
[1] 2.5
So SuperCalc and Lotus 123 aren’t in vogue any more?
I use R all the time. Any idea how I can get Excel and R to “talk” with each other? For example, to make a prediction on new live data feeding into Excel it would be nice to call the predict function from R to make the forecast. For simple linear regression models I just copy the regression coefficients from R over to Excel to make predictions on the fly in Excel. But that’s more difficult to do with more complicated models created using GAM or LOESS. Any thoughts?
When I need to transfer anything, I use csv files for the transfer.
But I find myself using Excel less and less these days, preferring to stay in R. Lately, especially when using Grok to help me code, I find it easier to do everything in R. However, Grok does make a lot of mistakes, so you need to take time to check everything. The nice part about Grok is you don’t have to spend as much time trying to figure out how to code your idea, the time goes into debugging Grok’s code, although Grok helps with that also.
I use the readxl package. Here is a quick snippet for pulling values out of Excel.
library("readxl")
sheet <- read_excel("path-to-file", sheet = "name-of-sheet")
yvalues <- sheet[["YValues"]]
yvalues <- yvalues[!is.na(yvalues)]
xvalues <- sheet[["XValues"]]
xvalues <- xvalues[1:length(yvalues)]
That’s my usual go-to method, although it’s not automated as Bob wanted.
w.
Bob, as usual, I asked this question of my AI of choice, perplexity.ai. Here’s my question.
===
You are an expert in R and Excel. Is it possible to write an Excel macro to call an R function and return the results to Excel?
===
and here’s the answer
===
Yes, it is possible, and there are several workable patterns depending on how tightly you want Excel and R to be integrated and whether you are on Windows-only or cross‑platform. [1] [2] [3]
Below are the main approaches that actually work in practice.
1. Call R from VBA via command line
This is the lowest‑friction method and works anywhere Rscript.exe is available (Windows). [1]
Basic idea:
• Write a VBA macro that:
• Reads input values from cells.
• Writes them to a CSV or passes them as command‑line args.
• Uses Shell to call Rscript.exe with an .R script.
• The R script reads the inputs, runs your R function, and writes results (e.g., CSV, TXT, or directly to an XLSX file via openxlsx).
• The macro then imports those results back into Excel. [4] [5] [1]
Skeleton VBA:
Sub RunR()
Dim cmd As String
Dim rPath As String
Dim scriptPath As String
rPath = """C:\Program Files\R\R-4.3.1\bin\Rscript.exe""" ' adjust
scriptPath = """C:\path\to\your_script.R"""
cmd = rPath & " " & scriptPath & " " & Range("A1").Value
Shell cmd, vbNormalFocus
' Then read results written by R (e.g., from a CSV)
End Sub
R side:
args <- commandArgs(trailingOnly = TRUE)
x <- as.numeric(args[1])
source("your_functions.R")
res <- your_fun(x)
write.csv(data.frame(res = res), "C:/path/to/result.csv", row.names = FALSE)
This is stateless (each call starts a fresh R session) and synchronous if you have VBA wait for the process to finish. [2] [1]
2. Use an Excel–R bridge add‑in (BERT, xlRcode, excel.link)
Several tools provide a more direct “call R from Excel/VBA and get the return value” model:
• BERT (Basic Excel R Toolkit): Exposes R functions directly to Excel and to VBA via Application.Run. [3]
• You define an R function, e.g.:
Add <- function(...) {
  sum(...)
}
BERT exports this to Excel as R.Add, and from VBA you can do:
Sub Test()
Dim x As Double
x = Application.Run("R.Add", 1, 2, 3, 4)
MsgBox x
End Sub
That gives you a direct scalar return value into VBA/Excel. [3]
• xlRcode: An Excel add‑in that defines worksheet/VBA functions like XLRFUNC/XLRFUNC_ENV that call arbitrary R functions and transfer values to and from Excel ranges. [6]
• Designed specifically to “Call R from Excel. Create new Excel functions that make use of R packages. Integrate both tools seamlessly.” [6]
• excel.link package (R side) + provided Excel macros: includes an ExecInR helper that lets an Excel macro send code to R and optionally get back output as a string, in a stateless synchronous way. [2]
These are the most natural if you want “macro calls R, gets result back into cells” without hand‑rolling CSV I/O.
3. RExcel / statconn (older, more complex stack)
RExcel (statconn) historically provided tight Excel↔R integration, including menu items, data transfer, and calling R from Excel. [7]
• However, it is Windows‑centric, somewhat dated, and the setup is heavier than BERT/xlRcode. [4] [7]
• It still does what you are asking, but given your profile you might prefer a simpler modern stack like BERT or xlRcode.
4. Via Rserve or a REST API
If you want a long‑lived R process:
• Run Rserve from R, and have VBA talk to it via a custom COM or socket client; this gives a persistent R session but requires more plumbing. [4]
• Alternatively, host R code behind a REST API (e.g., plumber), and call it from VBA using WinHTTP, then parse JSON back into Excel. [8]
This becomes attractive if you want multiple workbooks/users talking to the same R service or need to avoid repeated startup overhead.
5. Practical recommendation
Given your background and desire to “call an R function and return results” rather than just fire a script:
• For quickest working prototype on Windows:
• Install BERT or xlRcode.
• Define your R function(s) in the R side.
• Call them from VBA with Application.Run (BERT) or the appropriate XLR* helper (xlRcode) to get scalar or array results back into Excel ranges. [6] [3]
• If you prefer no add‑ins and are okay with stateless calls:
• Use the Shell + Rscript.exe pattern, passing inputs and reading outputs via files. [1] [4]
If you describe your specific use case (scalar vs vectors, latency tolerance, deployment constraints), a concrete VBA+R minimal example tailored to that can be outlined.
Thanks for this discussion. I myself have been using MATLAB since before it was called MATLAB, when it was just a couple of matrix inversion packages for mainframe computing. Over the years, I have eschewed using R in lieu of the (in my view then) more powerful MATLAB statistical toolboxes, coupled with the associated other capabilities and toolboxes involving image analysis, optimization, PDEs, ML, symbolic math, large scale intrinsic vectorization, etc. etc. and I have been teaching students to use it in analysis of lab experimental data for some years. But I have been re-thinking that perspective as AI and LLMs have become ever more powerful owing to virtually unlimited availability of memory and computational power by virtue of the cloud.
The world of AI seems to be comprised of largely non-proprietary codes such as Python, C++, JavaScript, etc. and I presume R in its present form has or will find a place in that world. As a result I have been revisiting my familiarity with all of the open source codes and this convinces me to add R to the list. And continue rethinking inclusion of these codes in teaching practice. One nice thing I have found about AI is that if you have some code in MATLAB (or other platforms) it will convert that code to whatever open source context you want and debug issues for you along the way. I have not tried it, but I suspect it will also write code for you if you just ask it nicely. It is an interesting world we live in.
I agree with you, and I think you are making the right decision. I would not have said this before my most recent project, but R, when combined with Grok, is extremely powerful.
My practice now is to design the program I want in detail, step-by-step. I then write some preliminary R code myself, sometimes leaving some gaps where I’m not quite sure how to code what I want. Then I supply Grok with the preliminary code and my step-by-step conceptual design and ask it to review and critique what I’ve done and make recommendations for the missing bits. I think you can do this all at once, but often I do it piecemeal.
Grok is very detailed in its response, which is very helpful; Copilot and ChatGPT are not as good at this. Once the preliminary program is written and tested and seems to work, I ask Grok to review it and comment; it often has further ideas for improvements. This continues to the end result. I’ve tried it a bit with Python, but R is better and produces smaller and more efficient code. I know very little about MATLAB, but I’ve seen it produce some nice graphics.
I should add that when working with Grok on writing code, make sure you give it detailed and complex example input data. This should be obvious, but I have forgotten that before.
This is a cool article Andy. I use R for data analysis as well. It can do so much. I wish I knew it better than I do. It’s just that I already have to deal with so many other programming languages on a daily basis that I find my R skills to be subpar, and I occasionally forget how to do even basic stuff because I don’t use it as frequently. I am working on a project in my professional life in which I’m trying to incorporate R, so hopefully I can become more proficient.
Anyway, take a look at the vcovHAC function. This is how I compute the uncertainty of the linear regression trends I post here.
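A minimal sketch of that approach (assuming the sandwich and lmtest packages, with made-up trend data):

```r
library(sandwich)
library(lmtest)

set.seed(1)
t <- 1:100
y <- 0.02 * t + arima.sim(list(ar = 0.5), n = 100)  # trend + AR(1) noise

fit <- lm(y ~ t)
# HAC (autocorrelation-robust) standard errors for the trend coefficient
ct <- coeftest(fit, vcov. = vcovHAC(fit))
ct
```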
Thanks, looks interesting.
Learning APL in the mid 70s to understand the multidimensional geometry and its algebra underlying pattern recognition and associative memory thru being able to interactively execute the succinct expression of the math .
R , I believe , openly borrows some concepts from APL .
CoSy is my notekeeping environment evolved since then in open FORTH . It’s rules are Simple . It’s a true human level language in which one thinks in terms
of defining words rather than writing programs . CoSy is human level also in being a rich vocabulary . The currently ~ 2300 word vocabulary is open from parsing and array operators to the ( currently x86/7 ) chip .
The example under Pipes above would be expressed in current development CoSy as
f( 1 4 9 19 )f sqrtf avgf
which returns 2.5897… on the data stack for the next word down the line .
I’d leave the rounding until outputting .
To make a word out of that operation would simply be :
: sqrtavg sqrtf avgf ;
Then I can write
10 _iotaf sqrtavg | resulting |>| 1.93060
for the numbers 0 thru 9 .
This allows the definition of the Planck thermal power function in a line :
| ( WaveLength Temperature — EnergyDensity )
: Planckl *f [ h 2. _f *f c ^2f *f ]+ $ [ h c boltz %f *f ]+ $ %f expf f1. -f %f ;
| Atomic : ie: applies to entire lists of lists of WLs & Ts .
I’ll be adding this verb to https://cosy.com/4thCoSy/Code/Physics/general.f .
CoSy is open and freely downloadable at CoSy.com/4thCoSy/ . Contact me if you’d like to schedule a Zoom , or have a specific domain you’d like to tame .
Whoops .
f( 1 4 9 16 )f sqrtavg |>| 2.50000
And , for the heck of it , here’s the result for the squares of 1 thru 100000 .
100000 _iotaf
f 1 +ff 2 ^f sqrtavg |>| 50000.5000
There used to be a code description for reusable code, “WORM”—Write Once, Read Many.
C++, on the other hand, was described as a “WORN” language—Write Once, Read Never.
Seems like CoSy is that kind of language.
In R, I’d use the pipe operator %>% and write that as
c(1,4,9,16) %>%
sqrt() %>%
mean()
I can come back in ten years and say exactly what that does.
Best to you,
w.
Thank you for the feedback . I think any precise succinct notation is subject to the WORN accusation . ( C++ was ?? )
Certainly APL has been .
But I learned it because it actually generalized the notation of linear algebra which if you don’t know it , is inscrutable .
And unless you know it , you can’t even think as succinctly .
See Simplicity for the rules of CoSy , the language , in one page .
I find your R example of ” hardwraps a line of text to a certain character width. ” not only inscrutable , but verbose .
I went looking in https://cosy.com/4thCoSy/Code/CoSy/Furniture.f which has a lot of vocabulary for slicing and dicing , and found I’ve made
which does the essential modulo calculation . So I can write a line like
res> .. “bl ss 35 _i wrapidxs cut
to split the line above to 35 characters on blanks .
(
hardwraps a line of text to a certain
character width. ” not only inscrutable
, but verbose .
)
Note , that’s a list with the 3 lines as leafs .
In any case , more relevant is , for instance , the Planck distribution .
In ` polished ( numerically genericized ) CoSy that will be
: Planckl * [ h 2. * c ^2 * ]+ $ [ h c boltz % * ]+ $ % exp 1. - % ;
and will calculate entire spectra or sets of spectra .
Let’s see that in R .
And if you can understand the definition in standard math notation , there are only a few additional words I should need to explain .
Interestingly I got an email this morning I’ve posted to Tekydome with a link to Dan Good – The Two Greatest Lost Programming Languages , FORTH and APL . I haven’t gotten to the APL portion yet but the history of FORTH with astronomical and space applications ought be of interest .
Bob A
αlphas .Vita . Simplicity . Breaking Paradigms . #CoSyAnyDay . Musk : polish . Tekydome . Daily Blog
Thanks, Bob. Generally, on my planet, verbosity is a feature and not a bug. I can come back to code I wrote twenty years ago and immediately understand what it does.
So I make little effort to boil my code down to the most compact version.
And yes, FORTH was a most fascinating language that I could write in at one time. But that was long ago and in another galaxy …
Best to you and yours,
w.
Bob, I asked Perplexity to compare and contrast R and CoSy. Here’s the answer:
===
R is a mature, general-purpose statistical computing environment with a huge ecosystem, while CoSy is an experimental APL/Forth–inspired array language project aimed at an extremely compact “ultimate” universal language.[1]
Core purpose· R is designed for statistics, data analysis, graphics, and reproducible research, and has become a standard tool in data science and many scientific fields.[2]
· CoSy is a research/enthusiast project “towards [an] ultimate universal programming language merging the APL and FORTH families,” focusing on minimal, composable primitives and interactive notation rather than a broad applied ecosystem.[1]
Language model and paradigm· R is primarily a vectorized functional language with S-style syntax, lexical scoping, and a rich object system (S3/S4/R6), optimized for data frames, formulas, and statistical models.[2]
· CoSy draws on APL’s array-centric, symbolic operators and Forth’s concatenative style, aiming for terse, stack- and array-oriented code with a small core vocabulary and heavy reliance on composition.[1]
Ecosystem and tooling· R has CRAN, Bioconductor, and many domain-specific repositories, plus mature IDEs (RStudio, VS Code extensions), documentation standards, and literate programming via rmarkdown and knitr.[2]
· CoSy is a niche project hosted on cosy.com with language notes, experiments, and some implementation material but lacks a large package ecosystem, mainstream IDE support, or standardized distribution channels.[1]
Typical use cases· R is used for statistical modeling, machine learning, spatial analysis, time series, and reporting in academia, government, and industry; it integrates with databases, C/C++, Python, and office tools.[2]
· CoSy targets exploratory language design, ultra-concise array and stack programming, and experimentation by APL/Forth aficionados rather than production analytics or broad scientific workflows.[1]
For your use case (statistical and climate/data analysis), R is the practical workhorse, while CoSy is mainly interesting as a conceptual cousin to APL/k/J if you are exploring extreme terseness and alternative language design.[2][1]
===
In addition, R is free, cross-platform, and has packages for just about any application.
Equally important is the free R GUI editor, RStudio. It features autocompletion, command-click to go to the function definition, and best of all … you can select a section, a line of code, part of a line, or a single term, hit “command-enter” and it runs just that part. This makes debugging so simple that I almost NEVER have to set a breakpoint. I can just step through the code and find the error.
Give it a shot, you’ll never look back.
R and RStudio are available here.
w.
Willis , thanks . I will add this to those at AI Groks CoSy .
My main quibble is experimental . I’d prefer experiential , since the first thing I do in the morning is to tap F11 to put down a timestamp . And have for decades . A point Perp_ misses is that CoSy ( https://cosy.com/ ) is aimed at being , and is , my tool for the everyday business of life . Programming is not some special activity I go off to a special IDE like R for .
If I have a quantitative thought , I just write it down as a line in my notes and tap F6 to try it out .
But there is no denying it is a raw paradigm breaker leapfrogging traditional languages .
Of course , CoSy is also free to download , and open all the way to the ( currently just WinTel x86 ) chip in a way R can never be .
But that offers great opportunity .
Anyway , I plan to be available on Zoom this Saturday , Sat.Jan,20260117.1000-7 .
Contact me if you’d like to connect or have any problems downloading & running CoSy .
For fun, here’s an R function I use a lot for annotating graphics. It hardwraps a line of text to a certain character width.
Enjoy,
w.
===
hardwrap=function(thelen,thetext){
  require(stringi)
  thetext=trimws(thetext)
  if (nchar(thetext)<=thelen){
    newtext=thetext
  } else {
    ptr=0
    while(nchar(thetext)>thelen){
      ptr=ptr+1
      (thebreaks=stri_locate_all_fixed(thetext," ")[[1]][,1])
      (thesplit=thebreaks[max(which(thebreaks<thelen))])
      thestart=thesplit+1
      (textchunk=substr(thetext,1,thesplit-1))
      if (ptr==1){
        (newtext=textchunk)
      } else {
        (newtext=paste0(newtext,"\n",textchunk))
      }
      (thetext=substr(thetext,thestart,nchar(thetext)))
    }
    if (nchar(thetext)>0) newtext=paste0(newtext,"\n",thetext)
  }
  newtext
}
Excellent, I saved it.
Might be fun to start a thread to share R functions. I have literally hundreds of them that I use constantly in my programming.
w.
I’m on board! I don’t know if I have hundreds of interesting functions, but I have a few that others might be interested in. Obviously, you and I are not the only WUWT readers who are interested in R.
Start up a new post, I’ll contribute! In the meantime:
lmts=function(tser) { # applies the "lm" function to a time series
  if(length(grep("POSIX",class(time(tser))))>0) {
    tser=tser*secsperday
  }
  lm(tser~time(tser))
}
And using that, this one extracts a fitted line from a time series
getfitted=function(somets) { # get fitted line from time series
  ts(lmts(somets)$fitted.values,start=start(somets),frequency=frequency(somets))
}
This gets the trend of the time series
gettrend=function(somets) { # extract trend from time series
  as.double(lmts(somets)$coeff[2])
}
Best to all,
w.
I’m sorry I’m not clear on just what those functions do to present equivalents . One point you make tho , is you have a large vocabulary . That’s essential for Human level work . CoSy currently has | Allwords # |>| 2450 | words from the array operations down to the x86/7 .
All the scripts building CoSy and the Reva FORTH itself CoSy is built in are open at https://cosy.com/4thCoSy/ .
As described in Simplicity , all nouns are trees , and verbs may act on sets of the leafs which themselves may be lists of characters or numbers .
https://cosy.com/4thCoSy/Code/CoSy/math.f has a number of basic math & stat verbs .
An extremely powerful notion of Chuck Moore’s is the quote , ' , which returns the address of the next word , even if a verb , rather than executing it — thus allowing verbs to be passed as arguments to following verbs , what Iverson would call adverbs . This allows APL level power to the language , specifically a number of iterators ( https://cosy.com/4thCoSy/Code/CoSy/Help/Iterators.txt ) .
My aaply words in particular are recursive adverbs which apply a verb to the leafs of a list or corresponding leafs of pairs of lists .
As you can see , these descriptions of CoSy are much more in terms of the nature of the language as language . CoSy , like APL and FORTH , is truly a language in the sense that the act of programming is the defining of words which flow in sentences with existing words . You can also see why I titled my SiliconValley-ForthIntrestGroup talk last August Breaking Paradigms is Hard to Do .
Most relevant to calculating planetary temperature I would like to see the definition in R of the Planck Distribution , in current CoSy :
: Planckl *f [ h 2. _f *f c ^2f *f ]+ $ [ h c boltz %f *f ]+ $ %f expf f1. -f %f ;
so that it will apply to entire sets of wavelengths or temperatures .
Damn ! Because this commenting widget interprets the left tick , which I use to return the following word , in some undocumented way . it has screwed up a bunch of text I don’t have time to clean up now . But I think all the links are there .
That math link should be https://cosy.com/4thCoSy/Code/CoSy/math.f .
And the ” iterators ” : https://cosy.com/4thCoSy/Code/CoSy/Help/Iterators.txt .
My primary mathematical software tool is Maple, and I was surprised to find out (today) that it has a very extensive interface with R. I’ve never used R, but will look into how Maple uses it.
I’ve got a positive vibe about Maple , too . Dan Good – The Two Greatest Lost Programming Languages [ FORTH & APL ] mentions a number of languages have been strongly influenced by APL . MATLAB is another .
I use #CoSyAnyDay on LinkedIn for posts specifically about CoSy . I started it specifically to respond to Dyalog.com’s #APLMonday , which posted examples of APL challenges . I resorted to using Claude to decipher modern APL’s syntax into essentially the read-it do-it RPN of 4th.CoSy .